USE OF ANTHOCYANIN GENES TO MAINTAIN MALE STERILE PLANTS

The present invention relates to a method to maintain
male-sterile plants that can be used for the production
of hybrid seed of a plant crop species, to transgenic
inbred plants that can be used in such a process, and to
chimeric genes that can be used to produce such
transgenic inbred plants.

Background to the Invention 

In many, if not most plant species, the development 
of hybrid cultivars is highly desired because of their 
generally increased productivity due to heterosis: the 
superiority of performance of hybrid individuals compared 
with their parents (see e.g. Fehr, 1987, Principles of
Cultivar Development, Volume 1: Theory and Technique,
MacMillan Publishing Company, New York; Allard, 1960,
Principles of Plant Breeding, John Wiley and Sons, Inc.).

The development of hybrid cultivars of various plant
species depends upon the capability of achieving
essentially complete cross-pollination between
parents. This is most simply achieved by rendering one of
the parent lines male sterile (i.e. bringing it into a
condition in which pollen is absent or nonfunctional)
either manually, by removing the anthers, or genetically,
by using, in the one parent, cytoplasmic or nuclear genes
that prevent anther and/or pollen development (for a
review of the genetics of male sterility in plants see
Kaul, 1988, 'Male Sterility in Higher Plants', Springer
Verlag).

For hybrid plants where the seed is the harvested 
product (e.g. corn, oilseed rape) it is in most cases 
also necessary to ensure that fertility of the hybrid 
plants is fully restored. In systems in which the male 
sterility is under genetic control this requires the 
existence and use of genes that can restore male 
fertility. The development of hybrid cultivars is mainly 
dependent on the availability of suitable and effective 
sterility and restorer genes. 

Endogenous nuclear loci are known for most plant 
species that may contain genotypes which effect male 
sterility, and generally, such loci need to be homozygous 
for particular recessive alleles in order to result in a 
male-sterile phenotype. The presence of a dominant 'male
fertile' allele at such loci results in male fertility.

Recently it has been shown that male sterility can be 
induced in a plant by providing the genome of the plant 
with a chimeric male-sterility gene comprising a DNA 
sequence (or male-sterility DNA) coding, for example, for 
a cytotoxic product (such as an RNase) and under the 
control of a promoter which is predominantly active in 
selected tissue of the male reproductive organs. In this
regard stamen-specific promoters, such as the promoter of
the TA29 gene of Nicotiana tabacum, have been shown to be
particularly useful for this purpose (Mariani et al.,
1990, Nature 347:737, European patent publication ("EP")
0,344,029). By providing the nuclear genome of the plant
with such a male-sterility gene, an artificial
male-sterility locus is created containing the artificial
male-sterility genotype that results in a male-sterile
plant.

In addition it has been shown that male fertility can
be restored to the plant with a chimeric
fertility-restorer gene comprising another DNA sequence (or
fertility-restorer DNA) that codes, for example, for a
protein that inhibits the activity of the cytotoxic
product or otherwise prevents the cytotoxic product from
being active in the plant cells (European patent
publication "EP" 0,412,911). For example the barnase gene
of Bacillus amyloliquefaciens codes for an RNase, called
barnase, which can be inhibited by a protein, barstar,
that is encoded by the barstar gene of B.
amyloliquefaciens. The barnase gene can be used for the
construction of a sterility gene while the barstar gene
can be used for the construction of a fertility-restorer
gene. Experiments in different plant species, e.g.
oilseed rape, have shown that a chimeric barstar gene can
fully restore the male fertility of male sterile lines in
which the male sterility was due to the presence of a
chimeric barnase gene (EP 0,412,911, Mariani et al.,
1991, Proceedings of the GCIRC Rapeseed Congress, July
9-11, 1991, Saskatoon, Saskatchewan, Canada; Mariani et
al., 1992, Nature 357:384). By coupling a marker gene,
such as a dominant herbicide resistance gene (for example
the bar gene coding for phosphinothricin acetyl
transferase (PAT) that converts the herbicidal
phosphinothricin to a non-toxic compound [De Block et
al., 1987, EMBO J. 6:2513]), to the chimeric
male-sterility and/or fertility-restorer gene, breeding
systems can be implemented to select for uniform
populations of male sterile plants (EP 0,344,029; EP
0,412,911).

The production of hybrid seed of any particular
cultivar of a plant species requires: 1) the maintenance
of small quantities of pure seed of each inbred parent,
and 2) the preparation of larger quantities of seed of
each inbred parent. Such larger quantities of seed would
normally be obtained by several (usually two) seed
multiplication rounds, starting from a small quantity of
pure seed ("basic seed") and leading, in each
multiplication round, to a larger quantity of pure seed
of the inbred parent and then finally to a stock of seed
of the inbred parent (the "parent seed" or "foundation
seed") which is of sufficient quantity to be planted to
produce the desired quantities of hybrid seed. Of course,
in each seed multiplication round larger planting areas
(fields) are required.

In order to maintain and enlarge a small stock of
seeds that can give rise to male-sterile plants it is
necessary to cross the male-sterile plants with normal
pollen-producing parent plants. In the case in which the
male sterility is encoded in the nuclear genome, the
offspring of such a cross will in all cases be a mixture of
male-sterile and male-fertile plants and the latter have
to be removed from the former. With male-sterile plants
containing an artificial male-sterility locus as
described above, such removal can be facilitated by
genetically linking the chimeric male-sterility gene to a
suitable marker gene, such as the bar gene, which allows
the easy identification and removal of male-fertile
plants (e.g. by spraying of an appropriate herbicide).

However, even when suitable marker genes are linked
to male-sterility genotypes, the maintenance of parent
male-sterile plants still requires at each generation
the removal from the field of a substantial number of
plants. For instance in systems using a herbicide
resistance gene (e.g. the bar gene) linked to a chimeric
male-sterility gene, as outlined above, only half of the
parent stock will result in male-sterile plants, thus
requiring the removal of the male-fertile plants by
herbicide spraying prior to flowering. In any given
field, the removal of male-fertile plants effectively
reduces the potential yield of hybrid seed or the
potential yield of male-sterile plants during each round
of seed multiplication for producing parent seed. In
addition, removal of the male-fertile plants may lead to
irregular stands of the male-sterile plants. For these
reasons removal of the male-fertile plants is
economically unattractive for many important crop species
such as corn and oilseed rape.
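
The 1:1 segregation behind the "only half of the parent stock" figure can be illustrated with a minimal sketch. It assumes a hemizygous male-sterile parent (S/-, with the herbicide resistance marker linked to S) pollinated by a male-fertile parent lacking the gene (-/-). The function name and the tuple representation of genotypes are hypothetical and serve only this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross_one_locus(mother, father):
    """Return offspring genotype frequencies for a single locus.

    Each parent is a 2-tuple of alleles, e.g. ("S", "-"), where "-" stands
    for absence of the foreign gene. Alleles are sorted within a genotype
    so that ("-", "S") and ("S", "-") are counted as the same class.
    """
    counts = Counter(
        tuple(sorted((a, b), reverse=True)) for a, b in product(mother, father)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Assumed cross: male-sterile S/- (female) x male-fertile -/- (pollinator).
progeny = cross_one_locus(("S", "-"), ("-", "-"))
for genotype, freq in progeny.items():
    print("/".join(genotype), freq)
```

Under these assumptions half of the progeny is S/- (male-sterile, herbicide resistant) and half is -/- (male-fertile, herbicide sensitive), which is the fraction that has to be removed from the field by spraying.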

Anthocyanins are pigments that are responsible for
many of the red and blue colors in plants. The genetic
basis of anthocyanin biosynthesis has been well
characterized, particularly in corn, Petunia, and
Antirrhinum (Dooner et al, 1991, Ann. Rev. Genet. 25:179-
199; Jayaram and Peterson, 1990, Plant Breeding Reviews
2:91-137; Coe, 1994, In "The Maize Handbook", Freeling
and Walbot, eds. Springer Verlag New York Inc., p. 279-
281). In corn anthocyanin biosynthesis is apparently
under control of 20 or more genes. The structural loci
C2, Whp, A1, A2, Bz1, and Bz2 code for various enzymes
involved in anthocyanin biosynthesis and at least 6
regulatory loci, acting upon the structural genes, have
been identified in corn, i.e. the R, B, C1, Pl, P and Vp1
loci.

The R locus has turned out to be a gene family (in 
corn located on chromosome 10) comprising at least three
different genes, i.e. R (which itself may comprise
duplicate genes organized in a tandem array), and the
displaced duplicate genes R(Sn) and R(Lc). R typically
conditions pigmentation of the aleurone but various
alleles are known to confer distinct patterns of
pigmentation. R(Lc) is associated with unique
pigmentation of leaves and R(Sn) with unique pigmentation
of the scutellar node. One state of R is associated with
pigmentation of the whole plant (R(P)), while another is
associated with pigmentation of the seeds (R(S)).

Alleles of the unlinked B locus (in corn located on 
chromosome 2) rarely condition pigmentation of the 
aleurone, but are frequently associated with pigmentation 
of mature plant parts. The B-peru allele, however,
pigments the aleurone (like R(S)). Analysis at the
molecular level has confirmed that the R and B loci are 
duplicate genes. 

In order that the R and B loci can color a particular
tissue, the appropriate allele of the C1 or Pl loci also
needs to be present. The C1 and C1-S alleles, for
instance, pigment the aleurone when combined with the
suitable R or B allele.
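
The combinatorial requirement described above (an R or B allele together with an appropriate C1 allele) can be written as a simple rule. The sketch below is a deliberate simplification: the allele sets contain only alleles named in the text, the function name is hypothetical, and structural genes, dominance relations and other regulatory loci are ignored.

```python
# Hypothetical allele sets for illustration; only alleles that the text
# associates with seed or aleurone pigmentation are listed.
ALEURONE_R_OR_B = {"R(S)", "B-peru"}
ALEURONE_C1 = {"C1", "C1-S"}


def aleurone_pigmented(alleles):
    """Return True if both an aleurone-active R or B allele and a C1-type
    allele are present among the given alleles."""
    present = set(alleles)
    has_r_or_b = bool(present & ALEURONE_R_OR_B)
    has_c1 = bool(present & ALEURONE_C1)
    return has_r_or_b and has_c1


print(aleurone_pigmented(["B-peru", "C1"]))  # both classes present
print(aleurone_pigmented(["B-peru"]))        # C1-type allele missing
```

The same rule explains why a line lacking one of the two regulatory genes produces colorless seed, and why supplying the missing gene can switch pigmentation on.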

Alleles of the C1 locus have been cloned and
sequenced. Of particular interest are C1 (Paz-Ares et al,
1987, EMBO J. 6:3553-3558) and C1-S (Scheffler et al,
1994, Mol. Gen. Genet. 242:40-48). Analysis of the
sequences revealed the presence of two introns in the
coding region of the gene. The proteins encoded by the C1
and C1-S alleles share homology with myb proto-oncogenes
and are known to be nuclear proteins with DNA-binding
capacity that act as transcriptional activators.

The cDNA of the B-peru allele has also been analyzed 
and sequenced (Radicella et al, 1991, Plant Mol. Biol. 
17:127-130). Genomic sequences of B-peru were also 
isolated and characterized based on the homology between 
R and B (Chandler et al., 1989, The Plant Cell 1:1175-
1183; Radicella et al., 1992, Genes & Development 6:2152-
2164). The tissue specificity of anthocyanin production
of two different B alleles was shown to be due to
differences in the promoter and untranslated leader
sequences (Radicella et al, 1992, supra).

Various alleles of the R gene family have also been
characterized at the molecular level, e.g. Lc (Ludwig et
al, 1989, PNAS 86:7092-7096), R-nj, responsible for
pigmentation of the crown of the kernel (Dellaporta et
al, 1988, In "Chromosome Structure and Function: Impact
of New Concepts", 18th Stadler Genetics Symposium,
Gustafson and Appels, eds. (New York, Plenum Press, pp.
263-282)), Sn (Consonni et al, 1992, Nucl. Acids Res.
20:373), and R(S) (Perrot and Cone, 1989, Nucl. Acids
Res. 17:8003).

The proteins encoded by the B and R genes share
homology with myc proto-oncogenes and have
characteristics of transcriptional activators.

It has been shown that various structural and
regulatory genes introduced in maize tissues by
microprojectiles operate in a manner similar to the
endogenous loci and can complement genotypes which are
deficient in the introduced genes (Klein et al., 1989,
PNAS 86:6681-6685; Goff et al., 1990, EMBO J. 9:2517-
2522). The Lc gene was also used as a visible marker for
plant transformation (Ludwig et al., 1990, Science
247:449-450). Apart from the above, other genes involved
in anthocyanin biosynthesis have been cloned (Cone, 1994,
In "The Maize Handbook", Freeling and Walbot eds.,
Springer Verlag New York Inc., p. 282-285).

In barley, Falk et al (1981, in Barley Genetics IV,
Proceedings of the 4th International Barley Genetics
Symposium, Edinburgh University Press, Edinburgh, pp.
778-785) have reported the coupling of a male-sterile
gene to a xenia-expressing shrunken endosperm gene which
makes it possible to select seeds, before planting, that
will produce male-sterile plants. Problems associated with
such a proposal include complete linkage of the two genes
(Stoskopf, 1993, Plant Breeding: Theory and Practice,
Westview Press, Boulder, San Francisco, Oxford). In
sweetcorn, a genetic system to produce hybrid corn seeds
without detasseling, which utilizes the closely linked
genes y (white endosperm) and ms (male sterility), was
suggested but was never used because of contamination
from 5% recombination. Galinat (1975, J. Hered. 66:387-
388) described a two-step seed production scheme that
resolved this problem by using electronic color sorters
to separate yellow from white kernels. This approach has
not been utilized commercially (Kaukis and Davis, 1986,
in "Breeding Vegetable Crops", the Avi Publishing
Company Inc., Westport, Connecticut, U.S.A., p. 498).

EP 0,198,288 and US Patent 4,717,219 describe methods
for linking marker genes (which can be visible markers or
dominant conditional markers) to endogenous nuclear loci
containing nuclear male-sterility genotypes.

EP 0,412,911 describes foreign restorer genes (e.g.
barstar coding region under control of a stamen-specific
promoter) that are linked to marker genes, including
herbicide resistance genes and genes coding for pigments
(e.g. the A1 gene) under control of a promoter which
directs expression in specific cells, such as petal
cells, leaf cells or seed cells, preferably in the outer
layer of the seed.

Summary of the Invention

The invention concerns a maintainer plant consisting 
essentially of cells which comprise in their genome: 

- a homozygous male-sterility genotype at a first genetic 
locus; and 

- a color-linked restorer genotype at a second genetic
locus, which is heterozygous (Rf/-) for a foreign DNA Rf
comprising:

a) a fertility-restorer gene capable of preventing the 
phenotypic expression of said male-sterility 
genotype, and 

b) at least one anthocyanin regulatory gene involved in 
the regulation of anthocyanin biosynthesis in cells 
of seeds of said plant and which is capable of 
producing anthocyanin at least in the seeds of said 
plant, so that anthocyanin production in the seeds 
is visible externally. 

The invention also concerns an anthocyanin regulatory
gene which is a shortened R, B or C1 gene or a
combination of shortened R, B or C1 genes which is
functional for conditioning and regulating anthocyanin
production in the aleurone.

The invention also includes a DNA such as a plasmid
comprising a fertility-restorer gene capable of
preventing the phenotypic expression of a male-sterility
genotype in a plant and at least one anthocyanin
regulatory gene involved in the regulation of anthocyanin
biosynthesis in cells of seeds of a plant and which is
capable of producing anthocyanin at least in the seeds of
a plant, so that anthocyanin production in the seeds is
visible externally.

Also within the scope of the invention is a process to
maintain a line of male-sterile plants, which comprises
the following steps:

i) crossing:

a) a male-sterile parent plant of said line having,
in a first genetic locus, a homozygous
male-sterility genotype, and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, stably
integrated in their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a color-linked restorer genotype at a second
genetic locus, which is heterozygous for a
foreign DNA comprising:

i) a fertility-restorer gene capable of
preventing the phenotypic expression of
said male-sterility genotype, and

ii) at least one anthocyanin regulatory gene
involved in the regulation of anthocyanin
biosynthesis in cells of seeds of said
plant which is capable of producing
anthocyanin at least in the seeds of said
plant, so that anthocyanin production in
the seeds is visible externally,

ii) obtaining the seeds from said parent plants, and

iii) separating on the basis of color, the seeds in which
no anthocyanin is produced and which grow into
male-sterile parent plants.

WO 95/34634 PCT/EP95/02157

Preferably, the genome of the male-sterile parent plant
does not contain at least one anthocyanin regulatory gene
necessary for the regulation of anthocyanin biosynthesis
in seeds of this plant to produce externally visible
anthocyanin in the seeds. In one embodiment of the
invention, the genome of the male-sterile parent plant
contains a first anthocyanin regulatory gene and the
genome of the maintainer plant a second anthocyanin
regulatory gene which, when present with the first
anthocyanin regulatory gene in the genome of a plant, is
capable of conditioning the production of externally
visible anthocyanin in seeds.

The invention also concerns a process to maintain a line 
of maintainer plants, which comprises the following 
steps: 

i) crossing: 

a) a male-sterile parent plant as described 
previously, and 

b) a maintainer parent plant as described 
previously, 

ii) obtaining the seeds from said male-sterile parent 
plant, and 

iii) separating on the basis of color, the seeds in which 
anthocyanin is produced and which grow into 
maintainer parent plants. 




The invention also relates to a kit for maintaining a 
line of male-sterile or maintainer plants, said kit 
comprising: 

a) a male-sterile parent plant of said line as described
previously, having, in a first genetic locus, a
homozygous male-sterility genotype and which is
incapable of producing externally visible anthocyanin
in seeds, and

b) a maintainer parent plant of said line as described
previously.



Also within the scope of the invention is a process to 
maintain the kit described previously which comprises: 

- crossing said male-sterile parent plant with said 
maintainer parent plant; 

- obtaining the seeds from said male-sterile parent 
plants and optionally the seeds from said maintainer 
parent plant in which no anthocyanin is produced; and 

- optionally growing said seeds into male-sterile parent 
plants and maintainer parent plants. 

As mentioned above, the present invention provides 
means to maintain a line of male-sterile plants, 
particularly corn or wheat plants. These means can be in
the form of a process which comprises the following
steps:

i) crossing A) a first parent plant of said line, which 
is male-sterile, and which is genetically characterized 
by the absence of at least one anthocyanin regulatory 
gene thereby being incapable of producing anthocyanin in 
seeds, particularly in the aleurone layer, and also by 
having at a first genetic locus a homozygous male- 
sterility genotype, and B) a second parent plant of said 
line, which is male-fertile, and which is genetically 
characterized by having at said first genetic locus, said 
homozygous male-sterility genotype, and at a separate
second genetic locus the genotype Rf/-, 

whereby, 

Rf is a foreign chimeric DNA (the "color-linked 
restorer gene") stably integrated in the nuclear genome 
of said plant which comprises: 

a) a fertility-restorer gene that is capable of
preventing the phenotypic expression, i.e. the
male-sterility, of said male-sterility genotype, and

b) said at least one anthocyanin regulatory gene (the
"color gene") involved in the regulation of the
anthocyanin biosynthesis in cells of seeds of said
cereal plant which is capable of producing
anthocyanin at least in the seeds, particularly in
the aleurone, of said cereal plant,

ii) obtaining the seeds from said first parent plants, and

iii) separating, on the basis of color, the seeds in 
which no anthocyanin is produced and in which the 
genotype at said first genetic locus is said homozygous 
male-sterility genotype and the genotype at said second 
genetic locus is -/-, and the seeds in which anthocyanin 
is produced and in which the genotype at said first 
genetic locus is said homozygous male-sterility genotype 
and the genotype at said second genetic locus is Rf/-. 
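
The cross and color-sorting steps above can be followed with a small two-locus sketch. It assumes that the two loci are unlinked, that the first locus carries the foreign genotype S/S in both parents, and that a single pollen-derived copy of Rf is enough to color the seed (endosperm ploidy is ignored). The locus names and helper functions are hypothetical.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """List the gametes of a parent given as {locus: (allele, allele)}.

    Loci are treated as unlinked, so every allele combination is produced.
    """
    loci = sorted(genotype)
    return [dict(zip(loci, combo)) for combo in product(*(genotype[l] for l in loci))]


def cross(female, male):
    """Return {(genotype at locus1, genotype at locus2): frequency}."""
    eggs, pollen = gametes(female), gametes(male)
    weight = Fraction(1, len(eggs) * len(pollen))
    freq = {}
    for egg, grain in product(eggs, pollen):
        key = tuple(
            "/".join(sorted((egg[l], grain[l]), reverse=True)) for l in sorted(egg)
        )
        freq[key] = freq.get(key, 0) + weight
    return freq


def sort_seeds(freq):
    """Split progeny classes by seed color: colored if Rf is present."""
    bins = {"colored": {}, "colorless": {}}
    for key, f in freq.items():
        color = "colored" if "Rf" in key[1] else "colorless"
        bins[color][key] = f
    return bins


# First parent (male-sterile, seed parent) and second parent (maintainer).
male_sterile = {"locus1": ("S", "S"), "locus2": ("-", "-")}
maintainer = {"locus1": ("S", "S"), "locus2": ("Rf", "-")}

bins = sort_seeds(cross(male_sterile, maintainer))
print(bins)
```

Under these assumptions the colorless bin holds only S/S, -/- seed (new male-sterile plants) and the colored bin only S/S, Rf/- seed (new maintainer plants), each making up half of the harvest, so the sorting takes place on seed rather than on plants in the field.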

Of particular interest in the invention is a second
parent plant in which said at least one anthocyanin
regulatory gene comprises a gene derived from a genomic
clone of an R or B gene, particularly an R or B gene that
conditions anthocyanin production in the aleurone,
preferably the B-peru allele (e.g. the shortened B-peru
gene in pCOL13), and/or comprises a gene derived from a
genomic clone of the C1 gene (e.g. the gene with the
sequence of SEQ ID NO 1 or SEQ ID NO 5) or the C1-S gene.

The first genetic locus can be endogenous to plants
of said line (in which case the homozygous male-sterility
genotype will be m/m), but is preferably a foreign locus
with genotype S/S in which S is a foreign DNA which, when
expressed in a plant is capable of rendering the plant
male-sterile. A preferred foreign DNA comprises at least:

s1) a male-sterility DNA encoding an RNA, protein or
polypeptide which, when produced or overproduced in a
cell of the plant, significantly disturbs the
metabolism, functioning and/or development of the
cell, and,

s2) a sterility promoter capable of directing expression
of the male-sterility DNA selectively in stamen
cells, preferably tapetum cells, of the plant; the
male-sterility DNA being in the same transcriptional
unit as, and under the control of, the sterility
promoter.

In case such a foreign male-sterility genotype is used,
the fertility-restorer gene in the foreign DNA Rf
preferably comprises at least:

a1) a fertility-restorer DNA encoding a restorer RNA,
protein or polypeptide which, when produced or
overproduced in the same stamen cells as said
male-sterility gene S, prevents the phenotypic expression
of said foreign male-sterility genotype comprising
S, and,

a2) a restorer promoter capable of directing expression
of the fertility-restorer DNA at least in the same
stamen cells in which said male-sterility gene S is
expressed, so that the phenotypic expression of said
male-sterility gene is prevented; the
fertility-restorer DNA being in the same transcriptional unit
as, and under the control of, the restorer promoter.

In case of an endogenous male-sterility genotype which is
homozygous for the recessive male-sterility allele m, the
fertility-restorer gene is preferably a DNA comprising
the dominant allele M of said locus.
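
The relation between the first locus, the restorer locus and the resulting fertility phenotype can be summarized as a rule. The following sketch is illustrative only: the function name is hypothetical, a foreign S is treated as dominant and an endogenous m as recessive, and any Rf copy is assumed to restore fertility completely.

```python
def is_male_fertile(first_locus, second_locus=("-", "-")):
    """Predict male fertility from two single-locus genotypes.

    first_locus:  alleles at the male-sterility locus, e.g. ("S", "S"),
                  ("m", "m") or ("M", "m").
    second_locus: alleles at the restorer locus, e.g. ("Rf", "-").
    """
    # A foreign S allele acts dominantly; endogenous sterility needs m/m.
    has_sterility = "S" in first_locus or tuple(first_locus) == ("m", "m")
    # Rf stands for a restorer matched to the sterility genotype in use
    # (a restorer DNA for S, or the dominant M allele for m/m).
    restored = "Rf" in second_locus
    return restored or not has_sterility


print(is_male_fertile(("S", "S")))                # male-sterile parent
print(is_male_fertile(("S", "S"), ("Rf", "-")))   # maintainer parent
```

In this simplified view the maintainer differs from the male-sterile parent only at the second locus, which is why the two can be kept within the same line.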

The present invention also provides the foreign
chimeric DNA Rf as used in [...],
plasmids comprising these chimeric genes, and host cells
comprising these plasmids.

The present invention also provides the shortened
B-peru gene in pCOL13 (SEQ ID NO 6) and the shortened C1
gene, particularly the EcoRI-SfiI fragment of pCOL9 of
SEQ ID NO 5.

The present invention also provides plants, the
nuclear genome of which is transformed with the foreign
chimeric DNA Rf, particularly the second parent plant.

Detailed Description of the Invention

A male-sterile plant is a plant of a given plant
species which is male-sterile due to expression of a
male-sterility genotype such as a foreign male-sterility
genotype containing a male-sterility gene. A restorer
plant is a plant of the same plant species that contains
within its genome at least one fertility-restorer gene
that is able to restore the male fertility in those
offspring obtained from a cross between a male-sterile
plant and a restorer plant and containing both a
male-sterility genotype and a fertility-restorer gene. A
restored plant is a plant of the same species that is
male-fertile and that contains within its genome a
male-sterility genotype and a fertility-restorer gene.

A line is the progeny of a given individual plant. 

A gene as used herein is generally understood to
comprise at least one coding region coding for an RNA,
protein or polypeptide which is operably linked to
suitable promoter and 3' regulatory sequences. A
structural gene is a gene whose product is e.g. an
enzyme, a structural protein, tRNA or rRNA. For example
anthocyanin structural genes encode enzymes (e.g.
chalcone synthase) directly involved in the biosynthesis
of anthocyanins in plant cells. A regulatory gene is a
gene which encodes a regulator protein which regulates
the transcription of one or more structural genes. For
example the R, B, and C1 genes are regulatory genes that
regulate transcription of anthocyanin structural genes.

For the purpose of this invention the expression of a
gene, such as a chimeric gene, will mean that the
promoter of the gene directs transcription of a DNA into
an mRNA which is biologically active i.e. which is either
capable of interacting with another RNA, or which is
capable of being translated into a biologically active
polypeptide or protein.

The phenotype is the external appearance of the 
expression (or lack of expression) of a genotype i.e. of 
a gene or set of genes (e.g. male-sterility, seed color, 
presence of protein or RNA in specific plant tissues
etc.).

As used herein, a genetic locus is the position of a 
given gene in the nuclear genome, i.e. in a particular 
chromosome, of a plant. Two loci can be on different
chromosomes and will then segregate independently. Two loci
can be located on the same chromosome and are then
generally considered as being linked (unless sufficient
recombination can occur between them).

An endogenous locus is a locus which is naturally 
present in a plant. A foreign locus is a locus which is 
formed in the plant because of the introduction, by means 
of genetic transformation, of a foreign DNA. 

In diploid plants, as in any other diploid organisms,
two copies of a gene are present at any autosomal locus.
Any gene can be present in the nuclear genome in several
variant states designated as alleles. If two identical
alleles are present at a locus that locus is designated
as being homozygous; if different alleles are present,
the locus is designated as being heterozygous. The
allelic composition of a locus, or a set of loci, is the
genotype. Any allele at a locus is generally represented
by a separate symbol (e.g. M and m, S and -, with -
representing the absence of the gene). A foreign locus is
generally characterized by the presence and/or absence of
a foreign DNA. A heterozygous genotype in which one
allele corresponds to the absence of the foreign DNA is
also designated as hemizygous (e.g. Rf/-). A dominant
allele is generally represented by a capital letter and
is usually associated with the presence of a biologically
active gene product (e.g. a protein) and an observable
phenotypic effect (e.g. R indicates the production of an
active regulator protein and under appropriate conditions
anthocyanin production in a given tissue while r
indicates that no active regulator protein is produced
possibly leading to absence of anthocyanin production).

A plant can be genetically characterized by 
identification of the allelic state of at least one
genetic locus.

The genotype of any given locus can be designated by
the symbols for the two alleles that are present at the
locus (e.g. M/m or m/m or S/-). The genotype of two
unlinked loci can be represented as a sequence of the
genotype of each locus (e.g. S/S, Rf/-).
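
The notation just described lends itself to a short parsing sketch. The function names below are hypothetical; the classification simply applies the definitions of homozygous, heterozygous and hemizygous given in the text, with "-" standing for absence of a foreign DNA.

```python
def parse_genotypes(text):
    """Split a multi-locus genotype such as 'S/S, Rf/-' into per-locus strings."""
    return [part.strip() for part in text.split(",")]


def zygosity(genotype):
    """Classify a single-locus genotype written as 'allele/allele'."""
    first, second = genotype.split("/")
    if first == second:
        return "homozygous"
    if "-" in (first, second):
        # One allele is the absence of the foreign DNA.
        return "hemizygous"
    return "heterozygous"


for locus in parse_genotypes("S/S, Rf/-"):
    print(locus, zygosity(locus))
```

Note that -/- is reported as homozygous here, since both alleles correspond to absence of the foreign DNA.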

The nuclear male-sterility genotype as used in this 
invention refers to the genotype of at least one locus,
preferably only one locus, in the nuclear genome of a
plant (the "male-sterility locus") the allelic 
composition of which may result in male sterility in the 
plant. A male-sterility locus may be endogenous to the 
plant, but it is generally preferred that it is foreign 
to the plant. 

Foreign male-sterility loci are those in which the
allele responsible for male sterility is a foreign DNA
sequence S (the "male-sterility gene") which when
expressed in cells of the plant makes the plant
male-sterile without otherwise substantially affecting the
growth and development of the plant. Such a male-sterility
gene preferably comprises at least:

s1) a male-sterility DNA encoding a sterility RNA,
protein or polypeptide which, when produced or
overproduced in a stamen cell of the plant,
significantly disturbs the metabolism, functioning
and/or development of the stamen cell, and,

s2) a sterility promoter capable of directing expression
of the male-sterility DNA selectively in stamen cells
(e.g. anther cells or tapetum cells) of the plant;
the male-sterility DNA being in the same
transcriptional unit as, and under the control of,
the sterility promoter.

The male-sterility locus preferably also comprises in
the same genetic locus at least one first marker gene T
which comprises at least:

t1) a first marker DNA encoding a first marker RNA,
protein or polypeptide which, when present at least
in a specific tissue or specific cells of the plant,
renders the plant easily separable from other plants
which do not contain the first marker RNA, protein or
polypeptide encoded by the first marker DNA at least
in the specific tissue or specific cells, and,

t2) a first marker promoter capable of directing
expression of the first marker DNA at least in the
specific tissue or specific cells; the first marker
DNA being in the same transcriptional unit as, and
under the control of, the first marker promoter.

Such a male-sterility gene is always a dominant allele
at such a foreign male-sterility locus. The recessive
allele corresponds to the absence of the male-sterility
gene in the nuclear genome of the plant.

Male-sterility DNAs and sterility promoters that can 
be used in the male-sterility genes in the first parent 
line of this invention have been described before (EP 
0,344,029 and EP 0,412,911). For the purpose of this 
invention the expression of the male-sterility gene in a 
plant cell should be able to be inhibited or repressed 
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for instance by means of expression of a suitable 
fertility-restorer gene in the same plant cell. In this 
regard a particular useful male-sterility DNA codes for 
barnase (Hartley, J.Mol. Biol. 1988 202:913). The 
sterility promoter can be any promoter but it should at 
least be active in stamen cells, particularly tapetum 
cells. Particularly useful sterility promoters are 
promoters that are selectively active in stamen cells, 
such as the tapetum-specific promoters of the TA29 gene
of Nicotiana tabacum (EP 0,344,029) which can be used in
tobacco, oilseed rape, lettuce, chicory, corn, rice,
wheat and other plant species; the PT72, the PT42 and PE1
promoters from rice which can be used in rice, corn,
wheat, and other plant species (WO 92/13956); the PCA55
promoter from corn which can be used in corn, rice, wheat
and other plant species (WO 92/13957); and the A9
promoter of a tapetum-specific gene of Arabidopsis
thaliana (Wyatt et al., 1992, Plant Mol. Biol.
19:611-922). However, the sterility promoter may also direct
expression of the sterility DNA in cells outside the 
stamen; particularly if the effect of expression of the 
male-sterility DNA is such that it will specifically 
disturb the metabolism, functioning and/or development of
stamen cells so that no viable pollen is produced. One 
example of such a male-sterility DNA is the DNA coding 
for an antisense RNA which is complementary to the mRNA 
of the chalcone synthase gene (van der Meer et al (1992)
The Plant Cell 4:253-262). In this respect a useful
promoter is the 35S promoter (see EP 0,344,029),
particularly a 35S promoter that is modified to have
enhanced activity in tapetum cells as described by van
der Meer et al (1992) The Plant Cell 4:253-262 (the
"35S-tap promoter").

A preferred endogenous male-sterility locus is one in
which a recessive allele (hereinafter designated as m) in
homozygous condition (m/m) results in male sterility. At
such loci male fertility is encoded by a corresponding
dominant allele (M). In many plant species such
endogenous male-sterility loci are known (see Kaul,
1988, supra; in corn see also recent issues of the Maize
Genetics Cooperation Newsletter, published by the
Department of Agronomy and U.S. Department of Agriculture,
University of Missouri, Columbia, Missouri, U.S.A.). The
DNA sequences in the nuclear genome of the plant
corresponding to m and M alleles can be identified by
gene tagging, i.e. by insertional mutagenesis using
transposons, or by means of T-DNA integration (see e.g.
Wienand and Saedler, 1987, In 'Plant DNA Infectious
Agents', Ed. by T.H. Hohn and J. Schell, Springer Verlag
Wien New York, p. 205; Shepherd, 1988, In 'Plant
Molecular Biology: a Practical Approach', IRL Press, p.
187; Teeri et al., 1986, EMBO J. 5:1755). It will be
evident that in the first and second parent plant of this
invention S/S can be replaced by m/m without affecting
the outcome of the process. Indeed, one feature of the
process of this invention is that the male-sterility
locus is homozygous thus allowing the use of 'recessive'
male-sterility alleles.

Fertility-restorer DNAs that can be used in the 
fertility restorer gene in the second parent line of this 
invention have been described before (EP 0,412,911). 

In this regard, fertility-restorer genes in which the
fertility-restorer DNA encodes barstar (Hartley, J.Mol. 
Biol. 1988 202:913) are particularly useful to inhibit 
the expression of a male-sterility DNA that encodes 
barnase. In this regard it is believed that a fertility- 
restorer DNA that codes for a mutant of the barstar 
protein, i.e. one in which the cysteine residue at
position 40 in the protein is replaced by serine 
(Hartley, 1989, TIBS 14:450), functions better in 
restoring the fertility in the restored plants of some 
species. 

In principle any promoter can be used as a restorer 
promoter in the fertility restorer gene in the second 
parent line of this invention. The only prerequisite is 
that such second parent plant, which contains both the 
color gene and the fertility-restorer gene, should be 
phenotypically normal and male-fertile. This requires 
that the restorer promoter in the fertility-restorer gene
should be at least active in those cells of a plant of
the same species in which the sterility promoter of the
corresponding male-sterility gene can direct expression
of the male-sterility DNA. In this regard it will be
preferred that the sterility promoter and the restorer
promoter are the same; they can for example both be
stamen-specific promoters (e.g. the TA29 promoter or the
CA55 promoter) or they can both be constitutive promoters
(such as the 35S or 35S-tap promoter). However, the
sterility promoter may be active only in stamen cells
while the restorer promoter is also active in other
cells. For instance, the sterility promoter can be a
stamen-specific promoter (such as the TA29 or CA55
promoter) while the restorer promoter is the 35S-tap
promoter.

When the male sterility to be restored is due to the
male-sterility genotype at an endogenous male-sterility
locus being homozygous for a recessive allele m, it is
preferred that the fertility-restorer gene is the
dominant allele of that male-sterility locus, preferably
under control of its own promoter. The DNA corresponding
to such a dominant allele, including its natural promoter,
can be isolated from the nuclear genome of the plant by
means of gene tagging as described above.

The nature of the color gene that is used in the
color-linked restorer gene in the second parent plant of
this invention depends upon the genotype of the
untransformed plants of the same line. Preferably, only
cereal plants with a genotype that does not condition
externally visible anthocyanin production in seeds,
particularly in the aleurone, can be used to produce the
second parent plants. These plants usually have a
genotype in which no functional copy of a suitable
regulatory gene such as the R or B gene, and/or the C1
gene, is present.

In corn, for instance, all of the currently used
inbred lines in the U.S.A. are r-r (pink anthers, leaf
tips, plant base) or r-g (green) and most of these are c1
and pl; at the B-locus the B-peru allele is very rare
(Coe et al, 1988, In 'Corn and Corn Improvement', 3rd
edition, G.F. Sprague and J.W. Dudley, eds., American
Society of Agronomy, Inc., Publishers, Madison, Wisconsin,
U.S.A.). The result is that no anthocyanins are produced
in the aleurone of these lines and that the kernels are
yellow. This requires that when these lines are
transformed with a color-linked restorer gene, the color
gene should consist of a functional R or B gene which
conditions anthocyanin production in aleurone, and
usually also a functional C1 gene capable of conditioning
anthocyanin production in aleurone.

A useful R or B gene is the B-peru gene, but of
course also other R genes could be used such as the R(S)
gene (Perrot and Cone, 1989, Nucl. Acids Res. 17:8003).
In this regard a gene derived from genomic clones of the
B-peru gene (Chandler et al, 1989, The Plant Cell
1:1175-1183) is believed to be particularly useful. However the
length of this genomic DNA (11 kbp) renders its practical
manipulation and use for transformation by direct gene
transfer difficult, certainly in combination with other
genes such as the restorer gene and the C1 gene.

In one inventive aspect of this invention it was
found that the B-peru gene could be considerably
shortened while still retaining, under appropriate
conditions, its capability of conditioning anthocyanin
production in the aleurone of seeds of cereal plants such
as corn. A preferred shortened B-peru gene is that of
Example 2.2, which is contained in plasmid pCOL13
(deposited under accession number LMBP 3041).

A useful C1 gene is the genomic clone as described by
Paz-Ares et al, 1987, EMBO J. 6:3553-3558. However the
length of this genomic DNA (4 kbp) precludes its
practical manipulation and use for transformation by
direct gene transfer, certainly in combination with other
genes such as the restorer gene and the B-peru gene.
Nevertheless other variants of the C1 gene can also be
used. In this regard Scheffler et al, 1994,
Mol. Gen. Genet. 242:40-48 have described the C1-S allele
which differs from the C1 allele of Paz-Ares et al, supra
by a few nucleotides in the promoter region near the CAAT
box and which is dominant to the wild-type allele (C1)
and shows enhanced pigmentation. The C1-S gene can be
easily used in this invention by appropriate changes in
the C1 gene. For example the TGCAG at positions 935 to
939 in SEQ ID NO 1 (respectively at positions 884-888 in
SEQ ID NO 5) can be easily changed to TTAGG yielding a
C1-S allele (respectively pCOL9S).
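
The base change described above can be illustrated in silico. The following is a minimal sketch only: the function name `substitute` and the toy sequence are assumptions introduced for illustration, and the toy sequence is a placeholder, not the sequence of SEQ ID NO 1. The sketch checks that the expected bases are present at the stated 1-based positions before replacing them.

```python
def substitute(seq: str, start: int, end: int, expected: str, new: str) -> str:
    """Replace seq[start..end] (1-based, inclusive) after verifying the old bases."""
    old = seq[start - 1:end]
    if old != expected:
        raise ValueError(f"expected {expected!r} at {start}-{end}, found {old!r}")
    if len(new) != len(expected):
        raise ValueError("replacement must keep the sequence length unchanged")
    return seq[:start - 1] + new + seq[end:]

# Hypothetical stand-in for SEQ ID NO 1 (NOT the real sequence):
# TGCAG is simply placed at positions 935-939 of a filler string.
toy_c1 = "A" * 934 + "TGCAG" + "C" * 100

# TGCAG -> TTAGG at positions 935 to 939, as described in the text.
toy_c1_s = substitute(toy_c1, 935, 939, "TGCAG", "TTAGG")
```

Because the replacement has the same length as the original bases, all other position numbers quoted for the sequence remain valid after the change.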

In one inventive aspect of this invention it was
found that the C1 gene (and the C1-S gene) could be
considerably shortened while still retaining, under
appropriate conditions, its capability of conditioning
anthocyanin production in the aleurone of seeds of cereal
plants such as corn. Preferred shortened C1 genes for
instance are those of Example 2.1 such as comprised in
pCOL9 which has the sequence of SEQ ID NO 5, particularly
as comprised between the EcoRI and SfiI sites of pCOL9,
and the corresponding shortened C1-S gene in pCOL9S.

The transcribed regions of the shortened B-peru and C1
genes still contain some small introns which can also be
deleted without affecting the function of the genes. It
is also believed that the shortened B-peru and C1 genes
can be somewhat further truncated at their 5' and 3'
ends, without affecting their expression in aleurone. In
particular it is believed that the sequence between
positions 1 and 3272 of SEQ ID NO 6 can also be used as a
suitable B-peru gene. It is also believed that this gene
can still be truncated at its 3' end down to a position
between nucleotides 2940 and 3000 of SEQ ID NO 6.

Although the use of genomic sequences of the B-peru
gene and the C1 gene, particularly the shortened B-peru
and/or the shortened C1 or C1-S genes, is preferred,
chimeric R, B, or C1 genes can also be used. For instance
a chimeric gene can be used which comprises the coding
region (e.g. obtained from the cDNA) of any functional R
or B gene (i.e. which conditions anthocyanin production
anywhere in the plant) which is operably linked to the
promoter region of a R or B gene which conditions
anthocyanin production in the aleurone (such as R(S) or
B-peru). Since the presence of anthocyanin does not
negatively affect growth, development and functioning of
plant cells, a constitutive promoter (e.g. the 35S
promoter), or a promoter which directs expression at
least in the aleurone can also be used in such a chimeric
gene. In this regard the promoter of the C1 gene can also
be used to direct expression of a DNA comprising the
coding region of a suitable R or B gene, particularly the
B-peru gene.

Similarly the coding region (e.g. obtained from cDNA)
of the C1 gene can be operably linked to the promoter of
a gene that directs expression at least in the aleurone.
In this regard, the promoter of the B-peru gene can also
be used to direct expression of a DNA comprising the
coding region of a suitable C1 gene such as that of the
C1 gene of SEQ ID NO 1 or of the C1-S gene.

In another inventive aspect of the invention it was
found that the promoters comprised in DNAs
characterized by the sequences between positions 1 to
1077, particularly between positions 447 and 1077, quite
particularly between positions 447 and 1061 of SEQ ID NO
1, between positions 396 and 1026 of SEQ ID NO 5, and
between positions 1 to 575, particularly between position
1 to 188 of SEQ ID NO 6 are promoters that predominantly,
if not selectively, direct expression of any DNA,
preferably a heterologous DNA, in the aleurone layer of
the seeds of plants.
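
The position pairs quoted in this description for SEQ ID NO 1 and SEQ ID NO 5 (447 and 396, 1077 and 1026, and 935-939 and 884-888 for the TGCAG motif) can be checked for internal consistency. The sketch below is illustrative only: the helper names are assumptions, and it presumes that, within the region common to both sequences, the two numberings differ by a single constant offset.

```python
# (position in SEQ ID NO 1, position in SEQ ID NO 5) pairs quoted in the text.
QUOTED_PAIRS = [(447, 396), (1077, 1026), (935, 884), (939, 888)]

def common_offset(pairs):
    """Return the single offset shared by all pairs, or raise if they disagree."""
    offsets = {seq1 - seq5 for seq1, seq5 in pairs}
    if len(offsets) != 1:
        raise ValueError(f"inconsistent offsets: {sorted(offsets)}")
    return offsets.pop()

OFFSET = common_offset(QUOTED_PAIRS)

def seq1_to_seq5(position: int) -> int:
    """Map a SEQ ID NO 1 position to SEQ ID NO 5 (assumed valid only in the shared region)."""
    return position - OFFSET
```

All quoted pairs differ by the same amount, so a position given in one numbering can be converted to the other by simple subtraction, provided it lies in the shared region.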

Of course in those lines in which a functional C1
gene is already present in the genome the color gene can
consist only of a suitable functional R or B gene (or a
chimeric alternative). Alternatively if a line already
contains a functional R or B gene which can condition
anthocyanin production in the aleurone, but no functional
C1 gene, only a functional C1 gene is required as a color
gene.

It is believed that the color genes of this invention
are especially useful in cereal plants, and that they are
of particular use in corn and wheat, and certainly in
corn.

For the purposes of this invention it is preferred
that, in the second parent plants, the "Rf" locus and the
male-sterility (e.g. "S") locus are not linked and
segregate separately.

In the second parent plant, the fertility restorer
gene, the B-peru gene and the C1 gene are preferably
closely linked. This can of course be achieved by
introducing these genes in the nuclear genome of the
plants as a single transforming foreign DNA (the Rf DNA)
thus forming a foreign Rf locus. Alternatively, the
fertility restorer gene and the color gene can be
separately introduced by cotransformation which usually
results in single locus insertions in the plant genome.

The color-linked restorer gene Rf as used in the
second parent plant preferably also comprises at least
c) a second marker gene which comprises at least:

c1) a second marker DNA encoding a second marker RNA,
protein or polypeptide which, when present at least in a
specific tissue or specific cells of the plant, renders
the plant easily separable from other plants which do not
contain the second marker RNA, protein or polypeptide
encoded by the second marker DNA at least in the specific
tissue or specific cells, and,

c2) a second marker promoter capable of directing
expression of the second marker DNA at least in the
specific tissue or specific cells: the second marker DNA
being in the same transcriptional unit as, and under the
control of, the second marker promoter.

First and second marker DNAs and first and second 
marker promoters that can be used in the first and second 
marker genes of this invention are also well known (EP 
0,344,029; EP 0,412,911). In this regard it is preferred 
that the first and second marker DNA are different, 
although the first and second marker promoter may be the 
same. 

Foreign DNA such as the fertility-restorer gene, the
foreign male-sterility gene, the B-peru and the C1 genes,
or the first or second marker gene preferably also are
provided with suitable 3' transcription regulation
sequences and polyadenylation signals, downstream (i.e.
3') from their coding sequence i.e. respectively the
fertility-restorer DNA, the male-sterility DNA, the
coding region of a color gene (such as a B-peru gene
and/or a C1 gene) or the first or second marker DNA. In
this regard either foreign or endogenous transcription 3'
end formation and polyadenylation signals suitable for
obtaining expression of the chimeric gene can be used.
For example, the foreign 3' untranslated ends of genes,
such as gene 7 (Velten and Schell (1985) Nucl. Acids Res.
13:6998), the octopine synthase gene (De Greve et al.,
1982, J.Mol. Appl. Genet. 1:499; Gielen et al (1983) EMBO
J. 3:835; Ingelbrecht et al., 1989, The Plant Cell 1:671)
and the nopaline synthase gene of the T-DNA region of
Agrobacterium tumefaciens Ti-plasmid (Depicker et al.,
1982, J.Mol. Appl. Genet. 1:561), or the chalcone synthase
gene (Sommer and Saedler, 1986, Mol. Gen. Genet.
202:429-434), or the CaMV 19S/35S transcription unit (Mogen et
al., 1990, The Plant Cell 2:1261-1272) can be used.
However, it is preferred that the color genes in this
invention carry their endogenous transcription 3' end
formation and polyadenylation signals.

The fertility-restorer gene, the male-sterility gene, 
the color gene or the first or second marker gene in 
accordance with the present invention are generally 
foreign DNAs, preferably foreign chimeric DNA. In this 
regard "foreign" and "chimeric" with regard to such DNAs 
have the same meanings as described in EP 0,344,029 and 
EP 0,412,911. 

The cell of a plant, particularly a plant capable of
being infected with Agrobacterium such as most
dicotyledonous plants (e.g. Brassica napus) and some
monocotyledonous plants, can be transformed using a
vector that is a disarmed Ti-plasmid containing the male-
sterility gene, the color-linked restorer gene or both
and carried by Agrobacterium. This transformation can be
carried out using the procedures described, for example,
in EP 0,116,718 and EP 0,270,822. Preferred Ti-plasmid
vectors contain the foreign DNA between the border
sequences, or at least located to the left of the right
border sequence, of the T-DNA of the Ti-plasmid. Of
course, other types of vectors can be used to transform
the plant cell, using procedures such as direct gene
transfer (as described, for example, in EP 0,233,247),
pollen mediated transformation (as described, for
example, in EP 0,270,356, PCT patent publication "WO"
85/01856, and US patent 4,684,611), plant RNA virus-
mediated transformation (as described, for example, in EP
0,067,553 and US patent 4,407,956) and liposome-mediated
transformation (as described, for example, in US patent
4,536,475). Cells of monocotyledonous plants such as the
major cereals including corn, rice, wheat, barley, and
rye, can be transformed (e.g. by electroporation) using
wounded or enzyme-degraded intact tissues capable of
forming compact embryogenic callus (such as immature
embryos in corn), or the embryogenic callus (such as type
I callus in corn) obtained therefrom, as described in WO
92/09696. In case the plant to be transformed is corn,
other recently developed methods can also be used such
as, for example, the method described for certain lines
of corn by Fromm et al., 1990, Bio/Technology 8:833;
Gordon-Kamm et al., 1990, Bio/Technology 2:603 and Gould
et al., 1991, Plant Physiol. 95:426. In case the plant to
be transformed is rice, recently developed methods can
also be used such as, for example, the method described
for certain lines of rice by Shimamoto et al., 1989,
Nature 338:274; Datta et al., 1990, Bio/Technology 8:736;
and Hayashimoto et al., 1990, Plant Physiol. 93:857.

The transformed cell can be regenerated into a mature
plant and the resulting transformed plant can be used in
a conventional breeding scheme to produce more
transformed plants with the same characteristics or to
introduce the male-sterility gene, the color-linked
restorer gene (or both), in other varieties of the same
or related plant species. Seeds obtained from the
transformed plants contain the chimeric gene(s) of this
invention as a stable genomic insert. Thus the male-
sterility gene, or the color-linked restorer gene of this
invention when introduced into a particular line of a
plant species can always be introduced into any other
line by backcrossing.

The first parent plant of this invention contains the
male-sterility gene as a stable insert in its nuclear
genome (i.e. it is a male-sterile plant). For the
purposes of this invention it is preferred that the first
parent plant contains the male-sterility gene in
homozygous condition so that it transmits the gene to all
of its progeny.

The second parent plant of this invention contains 
the male-sterility gene and the color-linked restorer 
gene as stable inserts in its nuclear genome (i.e. it is 
a restored plant). It is preferred that the male- 
sterility gene be in homozygous condition so that the 
second parent plant transmits the gene to all of its 
progeny and that the color-linked restorer gene be in 
heterozygous condition so that the second parent plant 
transmits the gene to only half of its progeny. 

It is preferred that the first and second parent 
plants are produced from the same untransformed line of a
plant species, particularly from the same inbred line of 
that species. 

The first and second parent plants of this invention 
have the particular advantage that seeds of such plants 
can be maintained indefinitely, and can be amplified to 
any desired amount (e.g. by continuous crossing of the 
two plant lines).

The color genes of this invention can be used as a
marker gene in any situation in which it is worthwhile to
detect the presence of a foreign DNA (i.e. a transgene)
in seeds of a transformed plant in order to isolate seeds
which possess the foreign DNA. In this regard virtually
any foreign DNA, particularly a chimeric gene, can be
linked to the color gene.

Examples of such foreign DNAs are genes coding for
insecticidal (e.g. from Bacillus thuringiensis),
fungicidal or nematocidal proteins. Similarly the color
gene can be linked to a foreign DNA which is the male-
sterility gene as used in this invention.

However, the color genes are believed to be of
particular use in the process of this invention in which
they are present in a foreign DNA which comprises a
fertility restorer gene (such as the barstar gene of
Bacillus amyloliquefaciens) under control of a stamen-
specific promoter (such as PTA29). In appropriate
conditions the use of the color genes allows the easy
separation of harvested seeds that will grow into male-
sterile plants, and harvested seeds that will grow into
male-fertile plants. In this regard the seeds are
preferably harvested from male-sterile plants (the first
parent plants) that are homozygous at a male-sterility
locus (such as a locus comprising the barnase gene under
control of PTA29) and which have been pollinated by
restorer plants (the second parent plants of this
invention) which contain in their genome two unlinked
gene loci one of which comprises the same male-sterility
locus which is homozygous for the same male-sterility
gene while the other is a foreign locus which comprises
an appropriate fertility restorer gene (i.e. whose
expression will counteract the expression of the male-
sterility gene) and also the color gene of this
invention, particularly an R or B gene that is expressed
in the aleurone and/or a C1 gene, preferably the B-peru
and C1 gene (e.g. as described in the examples). First
and second parent plants can be essentially produced as
described in the examples and as summarized in Figure 1.
In step 8 of Figure 1 it is demonstrated that the
crossing of the first and second parent plants of this
invention will give rise in the progeny to about 50% new
first parent (i.e. male-sterile) plants and about 50%
new second parent (i.e. male-fertile) plants and that
these two types of plants can already be separated at the
seed stage on the basis of color. Red kernels will grow
into male-fertile plants while yellow kernels will grow
into male-sterile plants.
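
The expected 50%/50% segregation can be reproduced by enumerating gametes. The following sketch is illustrative only and rests on stated assumptions: the first parent is taken as S/S at the male-sterility locus with no Rf locus (written -/-), the second parent as S/S with the color-linked restorer locus heterozygous (Rf/-), the two loci are unlinked, and Rf is treated as fully dominant for both red kernel color and restored fertility. The locus labels `ms` and `rf` and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All equally likely gametes for unlinked loci; genotype maps locus -> (allele, allele)."""
    loci = sorted(genotype)
    return [dict(zip(loci, combo)) for combo in product(*(genotype[l] for l in loci))]

def phenotype(offspring):
    """Kernel color and fertility, assuming Rf is dominant and linked to the color gene."""
    has_s = "S" in offspring["ms"]
    has_rf = "Rf" in offspring["rf"]
    color = "red" if has_rf else "yellow"
    fertility = "male-fertile" if (has_rf or not has_s) else "male-sterile"
    return color, fertility

def cross(female, male):
    """Return the expected phenotype frequencies among the progeny."""
    counts = Counter()
    for egg, pollen in product(gametes(female), gametes(male)):
        child = {locus: (egg[locus], pollen[locus]) for locus in egg}
        counts[phenotype(child)] += 1
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

first_parent = {"ms": ("S", "S"), "rf": ("-", "-")}    # male-sterile, yellow
second_parent = {"ms": ("S", "S"), "rf": ("Rf", "-")}  # restored, red
ratios = cross(first_parent, second_parent)
```

Under these assumptions every red kernel carries Rf and every yellow kernel lacks it, which is why sorting by color is equivalent to sorting by fertility.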

Thus a line of male-sterile first parent plants of 
this invention can be easily maintained by continued 
crossing with the second parent plants of this invention 
with, in each generation, harvesting the seeds from the 
male-sterile plants and separation of the yellow and red 
kernels. Of course in this way any desired amount of seed 
for foundation seed production of a particular line, such 
as an inbred line, can also be easily obtained. 

The red and yellow seeds harvested from a cereal
plant (e.g. the first parent plant of this invention) can
be separated manually. However, such separation can also
be effected mechanically. A color sorting machine for
corn kernels and other granular products is for instance
available from Xeltron U.S. (Redmond, Washington, U.S.A.).

Unless otherwise indicated all experimental
procedures for manipulating recombinant DNA were carried
out by the standardized procedures described in Sambrook
et al., 1989, "Molecular Cloning: a Laboratory Manual",
Cold Spring Harbor Laboratory, and Ausubel et al, 1994,
"Current Protocols in Molecular Biology", John Wiley &
Sons.

The polymerase chain reaction ("PCR") was used to
clone and/or amplify DNA fragments. PCR with overlap
extension was used in order to construct chimeric genes
(Horton et al, 1989, Gene 77:61-68; Ho et al, 1989, Gene
77:51-59).
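
The principle of joining two fragments through a shared overlap can be sketched at the sequence level. This is a simplified illustration, not the laboratory procedure of the cited references: the function name `overlap_join`, the minimum overlap of 8 bases and the toy fragments are all assumptions, and both fragments are taken to be written on the same strand.

```python
def overlap_join(left: str, right: str, min_overlap: int = 8) -> str:
    """Join two same-strand fragments whose ends share an identical overlap."""
    # Try the longest possible overlap first, down to the minimum allowed.
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no sufficient overlap between the two fragments")

# Hypothetical toy fragments sharing the 10-base overlap GATTACAGGC.
promoter_part = "AAAACCCC" + "GATTACAGGC"
coding_part = "GATTACAGGC" + "TTGGCCAA"
chimeric = overlap_join(promoter_part, coding_part)
```

The overlap appears only once in the joined product, which is how a promoter fragment and a coding fragment end up fused without extra bases at the junction.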

All PCR reactions were performed under conventional
conditions using the Vent™ polymerase (Cat. No. 254L -
New England Biolabs, Beverly, MA 01915, U.S.A.) isolated
from Thermococcus litoralis (Neuner et al., 1990,
Arch. Microbiol. 153:205-207). Oligonucleotides were
designed according to known rules as outlined for example
by Kramer and Fritz (1968, Methods in Enzymology
154:350), and synthesized by the phosphoramidite method
(Beaucage and Caruthers, 1981, Tetrahedron Letters
22:1859) on an Applied Biosystems 380A DNA synthesizer
(Applied Biosystems B.V., Maarssen, Netherlands).

In the following examples, reference will be made to
the following sequence listing and figures:

Sequence Listing

SEQ ID NO 1 : sequence of C1 gene
SEQ ID NO 2 : plasmid pTS256
SEQ ID NO 3 : EcoRI-HindIII region of pTS200
              comprising the chimeric gene
              PCA55-barstar-3'nos (the omitted
              region of pTS200 is derived from
              pUC19).
SEQ ID NO 4 : oligonucleotide 1
SEQ ID NO 5 : pCOL9 containing the shortened C1
              gene as an EcoRI-SfiI fragment
SEQ ID NO 6 : presumed sequence of the EcoRI-
              HindIII region of pCOL13
              containing the shortened B-peru
              gene (the rest of the plasmid is
              pUC19). The stretch of n
              nucleotides corresponds to a
              region of approximate length
              which is derived from the genomic
              clone of the B-peru gene but for
              which the sequence needs to be
              confirmed.
            : actual sequence of the EcoRI-
              HindIII region of pCOL13
              containing the shortened B-peru
              gene (the rest of the plasmid is
              pUC19).

Figures

Figure 1 : Breeding scheme to obtain the first and
           second parent plants of this invention
Figure 2 : Schematic structure of pCOL25, pCOL26,
           pCOL27, pCOL28, pCOL100 and pDE110.

Examples 

Example 1: Construction of plasmids containing the male-
sterility gene comprising the TA29 promoter and the
barnase coding region

Plasmids useful for transformation of corn plants and
carrying a male-sterility gene and a selectable marker
gene have been described in WO 92/09696 and WO 92/00275.
Plasmid pVE107 contains the following chimeric genes:
1) PTA29-barnase-3'nos, i.e. a DNA coding for barnase of
Bacillus amyloliquefaciens (barnase) operably linked to
the stamen-specific promoter of the TA29 gene of
Nicotiana tabacum (PTA29) and the 3' regulatory sequence
containing the polyadenylation signal of the nopaline
synthase gene of Agrobacterium tumefaciens (3'nos), and
2) P35S-neo-3'ocs, i.e. the coding region of the gene of
Tn5 of E.coli coding for neomycin phosphotransferase
(neo) operably linked to the 35S promoter of Cauliflower
Mosaic Virus (P35S) and the 3' regulatory sequence
containing the polyadenylation signal of the octopine
synthase gene of Agrobacterium tumefaciens (3'ocs).

Plasmid pVE108 contains the following chimeric genes:
1) PTA29-barnase-3'nos, and 2) P35S-bar-3'nos, i.e. the
gene of Streptomyces hygroscopicus (EP 242236) coding for
phosphinothricin acetyl transferase (bar) operably linked
to the P35S and 3'nos.

PTA29-barnase-3'nos is an example of a foreign chimeric
male-sterility gene (S) used in this invention.

Example 2: Construction of a plasmid containing a
color-linked restorer gene

2.1. Obtaining a shortened functional C1 gene

The C1 gene of maize was cloned from transposable
element-induced mutants and its sequence was reported
(Paz-Ares, 1987, EMBO J. 6:3553-3558). This sequence is reproduced
in SEQ ID NO. 1. Plasmid p36 (alternatively designated as
pClLCSkb and further designated as plasmid pXX036)
comprising a C1 genomic clone was obtained from Dr. H.
Saedler and Dr. U. Wienand of the Max-Planck-Institut
für Züchtungsforschung, Köln, Germany. pXX036 was
digested with SnaBI and HindIII, filled-in with Klenow,
and selfligated, yielding plasmid pCOL9. pCOL9
corresponds to pUC19 (Yanisch-Perron et al, 1985, Gene
33:103-119) which contains, between its EcoRI and
modified HindIII sites, the 2189 bp EcoRI-SnaBI fragment
(corresponding to the sequence between positions 448 and
2637 of SEQ ID NO 1) of pXX036.

pXX036 was also digested with SfiI and HindIII and
treated with Klenow to make blunt ends. After ligation
the plasmid in which the DNA downstream from the SfiI
site was deleted was designated as pCOL12.

The sequence TGCAG in pCOL9, corresponding to the
sequence at positions 884 to 888 in SEQ ID NO 5, is
changed to TTAGG, yielding pCOL9S which instead of a
shortened C1 gene contains a shortened overexpressing C1-S
gene (Scheffler et al, 1994, Mol. Gen. Genet. 242:40-48).
A similar change is introduced in pCOL12, yielding
pCOL12S.

2.2. Obtaining a shortened functional B-peru gene

Plasmid pBP2 (further designated as pXX004) is
plasmid pTZ18U (Mead et al., 1986, Protein Engineering
1:67; U.S. Biochemical Corp.) containing the genomic clone
of the B-peru gene. Plasmid p35SBPcDNA (further
designated as pXX002) is plasmid pMP6 (Goff et al, 1990,
EMBO J. 9:2517-2522) containing the cDNA corresponding to
the B-peru gene. Both plasmids were obtained from Dr. V.
Chandler of the University of Oregon, Oregon, U.S.A. A
2660 bp sequence of the genomic clone around the
translation initiation codon was reported
(EMBL/Genbank/DDBJ databases; locus name ZMBPERUA,
Accession number X70791; see also Radicella et al, 1992,
Genes & Development 6:2152-2164). The sequence of the
B-peru cDNA was also reported (Radicella et al, 1991, Plant
Mol. Biol. 17:127-130).

Substantial amounts of 5' and 3' flanking sequences
were deleted from pXX004, and the MluI-MunI fragment in
the coding region of the genomic clone was replaced by
the 1615 bp MluI-MunI fragment of the cDNA clone. The
resulting plasmid was designated as pCOL13 which was
deposited at the Belgian Coordinated Collection of
Microorganisms - LMBP Collection, Laboratory of Molecular
Biology, University of Ghent, K.L. Ledeganckstraat 35,
B-9000 Ghent, Belgium and was given the Accession Number
LMBP 3041. A shortened but functional B-peru gene is
contained in pCOL13 as an EcoRI-SalI fragment with an
approximate length of 4 kbp (see SEQ ID NO 6).
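
The replacement of a genomic fragment by the corresponding cDNA fragment between two restriction sites can be sketched at the sequence level. The example below is illustrative only: the recognition sequences used for MluI (ACGCGT) and MunI (CAATTG) are the standard ones for these enzymes, but the function names and both toy sequences are assumptions, with a lowercase stretch standing in for an intron. Cleavage positions and sticky ends are ignored; the whole site-to-site span is simply swapped.

```python
MLUI = "ACGCGT"  # MluI recognition sequence
MUNI = "CAATTG"  # MunI recognition sequence

def site_bounded(seq: str, left_site: str, right_site: str):
    """Return (start, end) indices spanning left_site ... right_site, sites included."""
    start = seq.index(left_site)
    end = seq.index(right_site, start + len(left_site)) + len(right_site)
    return start, end

def swap_fragment(genomic: str, cdna: str) -> str:
    """Replace the MluI-MunI span of the genomic clone by that of the cDNA clone."""
    g_start, g_end = site_bounded(genomic, MLUI, MUNI)
    c_start, c_end = site_bounded(cdna, MLUI, MUNI)
    return genomic[:g_start] + cdna[c_start:c_end] + genomic[g_end:]

# Hypothetical toy clones: the lowercase stretch mimics an intron in the genomic clone.
toy_genomic = "TTTT" + MLUI + "GGAA" + "gtaaatttag" + "CCTT" + MUNI + "AAAA"
toy_cdna = "CC" + MLUI + "GGAA" + "CCTT" + MUNI + "GG"
shortened = swap_fragment(toy_genomic, toy_cdna)
```

The flanking genomic sequences are retained while the intron-containing interior is replaced by the intron-free cDNA segment, which is the source of the shortening.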

2.3. Combining the C1 and B-peru genes

The C1 gene in pCOL9 and the B-peru gene in pCOL13
were then combined as follows. The 4 kbp EcoRI-Sall 



NSDOCID: <WO asJUfirwAi) I ^ 



CONFIRMATION COPY 



WO 95/34634 PCT/EP95/02157 

48 

fragment of pC0L13 was introduced between the EcoRI and 
Sail sites of the vector pBluescript II SK(-) 
(Stratagene) , yielding #7 B SK(-) . pCOL9 was digested 
with Sfil, treated with Klenow to fill in protruding 

5 ends, and further digested with EcoRI . The 1978 bp 

Sfil (Klenow) /EcoRI was then introduced between the EcoRI 
and Smal sites of #7 B SK(-) , yielding #7 B+C SK(-). 
Finally the Xhol site in the CI sequence was removed as 
follows. The 950 bp EcoRI-SacII fragment of #7 B SK(-) 

10 (EcoRI site corresponding to the EcoRI site at position 

1506 in SEQ ID NO 1; the SacII site from the pBluescript 
linker) was introduced between the EcoRI and SacII sites 
of the Phagescript Vector (Stratagene) to yield pC0L21. 
Single strands of pC0L21 were prepared and hybridized to 

15 the following synthetic oligonucleotide 1 (SEQ ID No. 4): 

5 f -CGT TTC TCG AAT CCG ACG AGG-3 1 
resulting in a silent change ( CTCGAG -> CTCGAA) and 
removal of the Xhol site . 
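
The effect of oligonucleotide 1 can be checked in a few lines of Python. The sketch below is illustrative only: the 30-nucleotide window is taken from positions 2071-2100 of SEQ ID NO 1, and the reading frame (codons starting at position 2072) is an assumption inferred from the annotated end of the C1 coding region at position 2134. The names `CONTEXT_2071_2100`, `translate` and `OLIGO_1` are introduced here for illustration.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna; trailing 1-2 bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


XHOI_SITE = "CTCGAG"
OLIGO_1 = "CGTTTCTCGAATCCGACGAGG"  # SEQ ID No. 4, spaces removed

# Positions 2071-2100 of SEQ ID NO 1 (wild-type C1 sequence).
CONTEXT_2071_2100 = "GGCGTCGTTT" "CTCGAGTCCG" "ACGAGGACTG"
# Mutated context: the single G -> A change introduced by the oligonucleotide.
mutated = CONTEXT_2071_2100.replace("CTCGAG", "CTCGAA", 1)

# Assumed frame: codons start at position 2072, i.e. index 1 of the window.
wild_protein = translate(CONTEXT_2071_2100[1:])
mutant_protein = translate(mutated[1:])

print(XHOI_SITE in CONTEXT_2071_2100, XHOI_SITE in mutated)  # True False
print(OLIGO_1 in mutated)                                    # True
print(wild_protein, mutant_protein)                          # ASFLESDED ASFLESDED
```

Under the assumed frame the changed codon is GAG -> GAA, both of which encode glutamate, which is consistent with the change being described as silent. The triplet spacing printed in the oligonucleotide itself does not follow this frame.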

The 710 bp AatII-SacII fragment of #7 B SK(-) was then
exchanged for the corresponding AatII-SacII fragment of
the mutated pCOL21, yielding pCOL23.

pCOL23 was then linearized with SacII, treated with
Klenow, and ligated to XhoI linker sequence (Stratagene),
yielding pCOL24.






Using the same procedure as described above, the
shortened C1-S gene of pCOL9S is combined with the
shortened B-peru gene of pCOL23, yielding plasmid
pCOL24S.

2.4. Construction of vectors comprising the C1 and B-peru
genes as well as a restorer gene, and a selectable
marker gene

pTS256 is derived from pUC19 and contains the
following two chimeric genes: 1) P35S-bar-3'nos, and 2)
PTA29-barstar-3'nos, i.e. a DNA coding for barstar of
Bacillus amyloliquefaciens (barstar or bar*) operably
linked to PTA29 and 3'nos. The complete sequence of
pTS256 is given in SEQ ID NO 2.

pTS200 is derived from pUC19 and contains the
following two chimeric genes: 1) P35S-bar-3'nos, and 2)
PCA55-barstar-3'nos, i.e. barstar operably linked to the
stamen-specific promoter PCA55 of Zea mays and 3'nos. The
complete sequence of pTS200 is given in SEQ ID NO 3.

pTS256 was modified by the inclusion of NotI linkers
(Stratagene) in both the unique SspI and SmaI sites,
yielding pTS256NN. The shorter BspEI-SacII fragment of
pTS256NN was then replaced by the shorter BspEI-SacII
fragment of pTS200, yielding pTS256NN+200.

pTS256NN contains P35S-bar-3'nos and PTA29-barstar-3'nos
on a NotI cassette. pTS256NN+200 contains P35S-bar-3'nos
and PCA55-barstar-3'nos on a NotI cassette.

The NotI cassette of pTS256NN was introduced in the
NotI site of pCOL24, yielding pCOL25 and pCOL26 which
differ with respect to the orientation of the
P35S-bar-3'nos gene with respect to the shortened C1 gene
(Figure 2).

The NotI cassette of pTS256NN+200 was introduced in
the NotI site of pCOL24, yielding pCOL27 and pCOL28 which
differ with respect to the orientation of the
P35S-bar-3'nos gene with respect to the shortened C1 gene
(Figure 2).

Plasmids pCOL25, pCOL26, pCOL27 or pCOL28 contain a
color-linked restorer gene Rf and a selectable marker
gene (P35S-bar-3'nos). Rf comprises the shortened C1 and
B-peru genes and a chimeric barstar gene (either
PTA29-barstar-3'nos or PCA55-barstar-3'nos).

Plasmids pCOL25S, pCOL26S, pCOL27S or pCOL28S,
containing the shortened C1-S gene instead of the
shortened C1 gene, are obtained in a similar way using
pCOL24S instead of pCOL24.

2.5. Construction of vectors comprising the C1 and B-peru
genes as well as male-sterility gene

Plasmid pTS59 can be obtained from plasmid pTS256 (of SEQ
ID NO 2) by replacing the fragment extending from
positions 1 to 1470 (comprising the chimeric gene
P35S-bar-3'nos) with the sequence TATGATA. Then NotI linkers
(Stratagene) were introduced in the EcoRV and SmaI sites
of pTS59, yielding pTS59NN. Finally the NotI fragment
comprising the chimeric gene PTA29-barstar-3'nos was
introduced in the NotI site of #7 B+C SK(-), yielding
pCOL100 (the general structure of pCOL100 and pDE110 is
also presented in Figure 2).

2.6. Expression of shortened C1 and B-peru in aleurone in
corn seeds

Dry seeds were incubated overnight in water at room
temperature and were then peeled and sliced in half. Four
to six half kernels were placed with the cut side on wet
filter paper and were bombarded with tungsten particles
(diameter 0.7 µm) which were coated with DNA.
Particle bombardment was essentially carried out using
the particle gun and procedures as described by Zumbrunn
et al, 1989, Technique, 1:204-216. The tissue was placed
at 10 cm from the stopping plate while a 100 µm mesh was
placed at 5 cm from the stopping plate.

DNA of the following plasmids was used:

- pXX002: B-peru cDNA under control of the 35S promoter

- pXX201: C1 cDNA under control of the 35S promoter

- pCOL13: shortened B-peru gene as described in Example 2.2

- pCOL12: shortened C1 gene as described in Example 2.1

- pCOL100: shortened B-peru and shortened C1 and
PTA29-barstar-3'nos as described in Example 2.5.

After bombardment the tissue was incubated for 2 days on
wet filter paper at 27°C and was then checked for the
presence of red spots indicating anthocyanin production.

Note: + indicates that anthocyanin production was
observed in at least one experiment; - indicates that no
anthocyanin production was observed; nt = not tested.

The results for three public lines (H99, Pa91, B73)
and 9 different, commercially important, proprietary
inbred lines from various sources are shown in Table 1.
The line c-ruq is a tester line which is homozygous for a
C1 allele that is inactivated by insertion of a receptor
for the regulator Uq (Cormack et al., 1988, Crop Sci.
28:941-944).

All lines which were r and c1 produced anthocyanin in
the aleurone after introduction of both a functional
B-peru and a functional C1 gene. Lines which were R and c1
produced anthocyanin upon introduction of a functional C1
gene. Lines which were r and C1 produced anthocyanin upon
introduction of a functional B-peru gene. This shows
that the B-peru and C1 genes are sufficient for
anthocyanin production in most corn lines. From the data
in Table 1 it is also evident that even the shortened
B-peru and C1 genes are still functional and are capable of
producing anthocyanin in aleurone of corn lines with
suitable genotypes.
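
The complementation pattern described above can be summarized as a simple boolean rule. The following Python sketch is a deliberately simplified model, not part of the original experiments: it assumes that pigment requires one R/B-type function and one C1-type function, each supplied either by the host line or by the bombarded DNA, and it ignores any other factor that may affect pigmentation. The function name `aleurone_pigmented` is hypothetical.

```python
def aleurone_pigmented(host_has_R, host_has_C1, introduced_genes):
    """Return True if the simplified model predicts red spots in the aleurone.

    host_has_R / host_has_C1: whether the line carries a functional allele.
    introduced_genes: set of gene names delivered by bombardment,
    drawn from {"B-peru", "C1"}.
    """
    r_function = host_has_R or "B-peru" in introduced_genes
    c1_function = host_has_C1 or "C1" in introduced_genes
    return r_function and c1_function


# The three host genotypes discussed in the text.
cases = {
    "r, c1": (False, False),
    "R, c1": (True, False),
    "r, C1": (False, True),
}
for label, (has_r, has_c1) in cases.items():
    for genes in (set(), {"B-peru"}, {"C1"}, {"B-peru", "C1"}):
        result = "+" if aleurone_pigmented(has_r, has_c1, genes) else "-"
        print(f"{label:6s} introduced={sorted(genes)!s:18s} {result}")
```

In this model an r, c1 line scores + only when both genes are introduced, while R, c1 and r, C1 lines each need only the missing function, which mirrors the three statements made about Table 1.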

Example 3: Production of first parent corn plants by
transformation of corn with the plasmids of example 1.

Corn plants of line H99, transformed with a
male-sterility gene comprising a DNA encoding barnase of
Bacillus amyloliquefaciens under control of the promoter
of the TA29 gene of Nicotiana tabacum, have been
described in WO 92/09696. The transformed plants were
shown to be male-sterile.

Example 4: Production of second parent corn plants by
transformation of corn with the plasmids of example 2.

Corn inbred lines H99 and Pa91 are transformed using
the procedures as described in WO 92/09696 but using
plasmids pCOL25, pCOL26, pCOL27 or pCOL28 of Example 2.
Regenerated plants are selected that are male fertile and
in which the shortened C1 gene, the shortened B-peru gene,
the P35S-bar-3'nos gene, and the PTA29-barstar-3'nos (or
PCA55-barstar-3'nos) gene are expressed.

Alternatively the male-sterile plants of Example 3
(already containing the S gene) can be transformed with
plasmids pCOL25, pCOL26, pCOL27 or pCOL28 of Example 2 on
the condition that the S and Rf genes are linked to
different selectable marker genes.

Similarly, transformed corn plants are obtained using
plasmids pCOL25S, pCOL26S, pCOL27S or pCOL28S of Example
2.

In an alternative set of experiments the second
parent plants of this invention were obtained by
transforming corn plants of line H99, Pa91, and
(Pa91xH99)xH99 with two separate plasmids, one of which
contained the color-linked restorer gene (pCOL100), while
the other contained an appropriate selectable marker gene
such as a chimeric bar gene (pDE110) (alternatively a
chimeric neo gene may also be used). pDE110 was described
in WO 92/09696 and the construction of pCOL100 was
described in Example 2.5.

In yet another set of experiments the second parent
plants of this invention are obtained by transforming
corn plants with a purified fragment of the plasmids of
example 2.4. Such purified fragment is obtained by
digestion of the plasmids of example 2.4 with XhoI and
subsequent purification using conventional procedures
such as gel filtration.

Untransformed corn plants of lines H99 or Pa91 are
detasseled and pollinated with pollen of the plants
transformed with the Rf DNA. It is observed that the Rf
gene segregates in a Mendelian way and that the seed that
is harvested from these plants is colored and non-colored
(yellow) in a 1:1 ratio. The red color of the seeds is
correlated with the presence of the Rf gene.
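
The 1:1 ratio is what a single-locus testcross predicts when the pollen parent carries one copy of Rf. A minimal Python sketch of that expectation follows; it assumes a single insertion of Rf, equal transmission of both pollen classes, and that any seed carrying Rf is colored. The function name `colored_fraction` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product


def colored_fraction(female, male):
    """Expected fraction of colored seeds for a single-locus cross.

    Each parent is a pair of alleles at the Rf locus, written "Rf" for the
    transgene and "-" for the empty locus. A seed is scored as colored when
    it carries at least one copy of Rf.
    """
    offspring = list(product(female, male))
    colored = sum(1 for pair in offspring if "Rf" in pair)
    return Fraction(colored, len(offspring))


untransformed = ("-", "-")   # detasseled female parent
hemizygous = ("Rf", "-")     # pollen donor carrying one copy of Rf

print(colored_fraction(untransformed, hemizygous))  # 1/2
```

A ratio that departs clearly from one half would suggest more than one insertion or unequal transmission, both of which are outside this simple model.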

Example 5: The production of the first and second parent
plants of this invention

First parent plants and second parent plants (i.e. 
maintainer plants) according to the invention are 
produced along the lines set out in Figure 1. 

The male-sterile plants of step 1 are those produced 
in Example 1. The corn plants transformed with the 
color-linked restorer gene of step 2 are those produced 
in Example 4. 

A plant of Example 1 and a plant of Example 4 are
crossed (Step 3) and the progeny plants with the genotype
S/-, Rf/- are selected (Step 4), e.g. by demonstrating
the presence of both the S and Rf genes in the nuclear
genome (e.g. by means of PCR).

The plants selected in Step 4 are then crossed with
the male-sterile plants with genotype S/- (Step 5). The
colored seeds (i.e. those containing the Rf gene) are
selected, grown into plants, and examined for the
presence of both the S and Rf genes (e.g. by PCR). The
plants containing both the S and Rf genes are selfed and
the seeds of each plant are examined for seed color (red
or yellow). From the progeny of the selfings the
non-colored seeds are grown into plants (Step 6). The progeny
of the selfings in which all non-colored seeds grow into
male-sterile plants are retained (Step 6). These
male-sterile plants are all homozygous for the S gene and are
crossed with their fertile siblings (of genotype
S/S,Rf/Rf or S/S,Rf/-) (Step 7). For some crossings the
seeds harvested from the male-sterile plants are 50%
colored and 50% non-colored (Step 7). The colored seeds
all grow into fertile corn plants of genotype S/S,Rf/-
which are the maintainer plants, or the second parent
plants, of the present invention. The non-colored seeds
all grow into male-sterile plants of the genotype S/S,-/-
which are the first parent plants of this invention (Step
7).

The first and second parent plants are crossed and
the seeds harvested from the male-sterile plants are
separated on the basis of color (Step 8). All colored
seeds grow again into second parent plants and all
non-colored seeds grow into first parent plants, thereby
establishing an easy maintenance of a pure male-sterile
line of corn.
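
The logic of Steps 6 to 8 can be followed with a small two-locus enumeration in Python. This is a sketch under stated assumptions rather than a description of the actual breeding records: S and Rf are treated as single insertions that assort independently, a plant is male-sterile when it carries S without Rf, and a seed is colored when it carries Rf. All function and constant names are introduced here for illustration.

```python
from collections import Counter
from itertools import product


def canonical(pair):
    """Order an allele pair with the transgene first and "-" last."""
    return tuple(sorted(pair, key=lambda allele: allele == "-"))


def gametes(genotype):
    """All four equally likely gametes, assuming independent assortment."""
    s_pair, rf_pair = genotype
    return list(product(s_pair, rf_pair))


def cross(female, male):
    """Counter of progeny genotypes over the 16 gamete combinations."""
    progeny = Counter()
    for (s_f, rf_f), (s_m, rf_m) in product(gametes(female), gametes(male)):
        progeny[(canonical((s_f, s_m)), canonical((rf_f, rf_m)))] += 1
    return progeny


def is_colored(genotype):
    return "Rf" in genotype[1]


def is_male_sterile(genotype):
    return "S" in genotype[0] and "Rf" not in genotype[1]


def noncolored_all_sterile(parent):
    """Step 6 criterion: selfing gives only male-sterile non-colored seed."""
    selfed = cross(parent, parent)
    return all(is_male_sterile(g) for g in selfed if not is_colored(g))


FIRST_PARENT = (("S", "S"), ("-", "-"))   # male-sterile line
MAINTAINER = (("S", "S"), ("Rf", "-"))    # second parent

# Step 6: only plants homozygous for S pass the selfing test.
print(noncolored_all_sterile((("S", "S"), ("Rf", "-"))))  # True
print(noncolored_all_sterile((("S", "-"), ("Rf", "-"))))  # False

# Step 8: maintenance cross, male-sterile female x maintainer pollen.
for genotype, count in cross(FIRST_PARENT, MAINTAINER).items():
    seed = "colored" if is_colored(genotype) else "non-colored"
    plant = "male-sterile" if is_male_sterile(genotype) else "fertile"
    print(genotype, count, seed, plant)
```

In the maintenance cross the enumeration yields only two genotypes in equal numbers: colored seed that regenerates the maintainer and non-colored seed that regenerates the male-sterile line, so sorting by color separates the two parents.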

If the plant DNA that is flanking the S gene in the
plants of Example 1 has been characterized, the progeny
of the cross in Step 5 with genotypes S/S,-/- and S/S,Rf/-
can be easily identified by means of PCR using probes
corresponding to the flanking plant DNA. In this way Step
6 can be skipped because the plants of Step 5 which grow
from colored seeds (genotype S/S,Rf/-) can be crossed
directly to plants with genotype S/S,-/- (as in Step 7).

All publications cited in this application are hereby
incorporated by reference.

Example 6: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
control of the promoter of the C1-S gene.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and HindIII sites of the polylinker of
plasmid pUC19. The chimeric gene comprises the following
elements in sequence:

i) the promoter region of the C1-S gene, i.e. the DNA
fragment with the sequence of SEQ ID No. 1 from
nucleotide positions 447 up to 1076 but containing at
nucleotide positions 935-939 the sequence TTAGG instead
of TGCAG;

ii) a single C nucleotide;

iii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide positions 576 up to 4137.

This plasmid (designated as pLH52), together with plasmid
pCOL9S of Example 2 (comprising a C1-S gene) and pTS256
of SEQ ID No. 2 (comprising the following chimeric genes:
P35S-bar-3'nos and PTA29-barstar-3'nos), is used to
transform corn essentially as described in Example 4. The
transformed plants are then used to obtain second parent
plants as described in Example 5.
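
The C1-S promoter element is defined by coordinates and a five-nucleotide substitution, which is easy to get wrong by one position. The Python sketch below shows one way to apply such a coordinate-based change safely, using only the row of SEQ ID NO 1 that covers positions 901-960. The helper names `region` and `substitute` are hypothetical, and the check against the expected wild-type bases is a design choice of this sketch.

```python
def region(chunk, chunk_start, first, last):
    """Return bases first..last (1-based, inclusive) from a sequence chunk
    whose first base has the 1-based coordinate chunk_start."""
    return chunk[first - chunk_start:last - chunk_start + 1]


def substitute(chunk, chunk_start, first, last, expected, replacement):
    """Replace bases first..last, refusing to proceed if the bases found
    there differ from the expected wild-type sequence."""
    found = region(chunk, chunk_start, first, last)
    if found != expected:
        raise ValueError(f"expected {expected} at {first}..{last}, found {found}")
    index = first - chunk_start
    return chunk[:index] + replacement + chunk[index + len(expected):]


# Positions 901-960 of SEQ ID NO 1 (C1 promoter region).
ROW_901_960 = (
    "AGCAGTGTCG" "TGTCGTCCAT" "GCATGCACTT"
    "TAGGTGCAGT" "GCAGGGCCTC" "AACTCGGCCA"
)

c1s_row = substitute(ROW_901_960, 901, 935, 939, "TGCAG", "TTAGG")

print(region(ROW_901_960, 901, 935, 939))  # TGCAG
print(region(c1s_row, 901, 935, 939))      # TTAGG
```

Verifying the wild-type bases before substituting guards against off-by-one errors when moving between 1-based sequence-listing coordinates and 0-based string indices.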

Example 7: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
control of the 35S promoter.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and HindIII sites of the polylinker of
plasmid pUC19. The chimeric gene comprises the following
elements in sequence:

i) the promoter region of the 35S promoter, i.e. the
DNA fragment of pDE110 which essentially has the sequence
as described in SEQ ID No. 4 of WO 92/09696 (which is
incorporated herein by reference) from nucleotide
positions 396 up to 1779;

ii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide positions 576 up to 4137.

This plasmid (designated as pP35S-Bp), together with
plasmid pCOL9S of Example 2 (comprising a C1-S gene) and
pTS256 of SEQ ID No. 2 (comprising the following chimeric
genes: P35S-bar-3'nos and PTA29-barstar-3'nos), is used
to transform corn essentially as described in Example 4.
The transformed plants are then used to obtain second
parent plants as described in Example 5.

Alternatively plasmid p35SBperu as described in Goff et
al, 1990, EMBO J. 9:2517-2522 is used instead of pP35S-Bp.

Example 8: Maintainer plants containing a color-linked
restorer gene comprising the maize P gene coding region
under the control of the promoter of the C1-S gene.

Using conventional techniques a chimeric gene is inserted
in the EcoRI site of the polylinker of plasmid pUC19. The
chimeric gene comprises the following elements in
sequence:

i) the promoter region of the C1-S gene, i.e. the DNA
fragment with the sequence of SEQ ID No. 1 from nucleotide
positions 447 up to 1076 but containing at nucleotide
positions 935-939 the sequence TTAGG instead of TGCAG;

ii) a single C nucleotide;

iii) a DNA sequence comprising the coding region and
3' end untranslated region of the maize P gene as
described by Grotewold et al in 1991, PNAS 88:4587-4591
(nucleotides 320-1517). The maize P gene is an
anthocyanin regulatory gene which specifies red
phlobaphene pigmentation, a flavonoid pigment involved in
the biosynthetic pathway of anthocyanin. In fact, the
protein encoded by the P gene activates, among others,
the A1 gene required for both anthocyanin and phlobaphene
pigmentation. Two cDNA clones have been isolated and
sequenced by Grotewold et al and are described in the
publication referred to above. It is the longer cDNA
which is of particular interest for construction of this
chimeric gene. However, alternatively, the coding region
of the shorter transcript can also be used in this
chimeric gene, as well as the P gene leader sequence
instead of the C1-S gene leader sequence. The P gene does
not require a functional R or B gene to produce
pigmentation. The visible pigment that is produced in the
seeds of the maintainer plants is phlobaphene, a
flavonoid pigment (like anthocyanin) directly involved in
anthocyanin biosynthesis.

iv) a DNA fragment containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium
tumefaciens, i.e. the DNA fragment with the sequence of
SEQ ID No. 2 from nucleotide position 1600 up to
nucleotide position 2909.

The resulting plasmid (designated as pPCS1-P), together
with pTS256 of SEQ ID No. 2, is used to transform corn
essentially as described in example 4. The transformed
plants are then used to obtain second parent plants as
described in example 5.

Example 9: Maintainer plants containing a color-linked
restorer gene comprising the B-peru coding region under
the control of the B-peru promoter.

Using conventional techniques a chimeric gene is inserted
between the EcoRI and the HindIII sites of the polylinker
of plasmid pUC19. The chimeric gene comprises the
following elements in sequence:

i) the promoter of the B-peru gene, i.e. a 1952 bp
DNA sequence as disclosed in the EMBL databank under
accession number X70791;

ii) the coding region and 3' untranslated region of
the B-peru gene, i.e. the DNA fragment with the sequence
of SEQ ID No. 7 from nucleotide position 576 up to 4137.

This plasmid (designated as pCOL11), together with plasmid
pCOL9S of example 2 (comprising a C1-S gene) and pTS256
of SEQ ID No. 2 (comprising the following chimeric genes:
P35S-bar-3'nos and PTA29-barstar-3'nos), is used to
transform corn essentially as described in example 4. The
transformed plants are then used to obtain second parent
plants as described in example 5.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: PLANT GENETIC SYSTEMS N.V. 

(B) STREET: Jozef Plateaustraat 22 

(C) CITY: Ghent 

(E) COUNTRY: Belgium 

(F) POSTAL CODE (ZIP) : 9000 

(G) TELEPHONE: 32 9 235 84 11 

(H) TELEFAX: 32 9 224 06 94 

(I) TELEX: 11.361 Pgsgen 

(ii) TITLE OF INVENTION: Use of anthocyanin genes to maintain 
male-sterile plants 

(iii) NUMBER OF SEQUENCES: 7 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/254,776 

(B) FILING DATE: 06-JUN-1994 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4059 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: C1 gene of Zea mays

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 279..284
(D) OTHER INFORMATION: /label= HpaI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 447..452
(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1735..1740
(D) OTHER INFORMATION: /label= AatII

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1505..1510
(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2081..2086
(D) OTHER INFORMATION: /label= XhoI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2418..2430
(D) OTHER INFORMATION: /label= SfiI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2669..2674
(D) OTHER INFORMATION: /label= SnaBI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2634..2639
(D) OTHER INFORMATION: /label= SnaBI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 3008..3013
(D) OTHER INFORMATION: /label= HpaI

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..1077
(D) OTHER INFORMATION: /label= PC1
/note= "region containing promoter of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1078..2134
(D) OTHER INFORMATION: /label= C1
/note= "coding region of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2135..2430
(D) OTHER INFORMATION: /label= 3'C1
/note= "region containing polyadenylation signal of C1 gene"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1033..1038
(D) OTHER INFORMATION: /label= TATA-Box

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1061..1062
(D) OTHER INFORMATION: /label= transcript-init
/note= "transcription initiation site"

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 1211..1299

(ix) FEATURE:
(A) NAME/KEY: intron
(B) LOCATION: 1430..1575

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 935..939
(D) OTHER INFORMATION: /label= C1-S
/note= "TGCAG sequence (in C1 gene) which in the C1-S
sequence is changed to TTAGG"
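
The annotated restriction-site coordinates can be cross-checked against the sequence itself. The short Python sketch below does this for two rows of SEQ ID NO 1; it is illustrative only, and the helper name `find_sites` and the row constants are introduced here. The recognition sequences GAATTC (EcoRI) and CTCGAG (XhoI) are the standard ones for these enzymes.

```python
import re


def find_sites(chunk, chunk_start, site):
    """Return 1-based inclusive (first, last) coordinates of every
    occurrence of site in chunk, where the first base of chunk has the
    coordinate chunk_start. Overlapping occurrences are reported."""
    pattern = re.compile("(?=" + re.escape(site) + ")")
    return [
        (m.start() + chunk_start, m.start() + chunk_start + len(site) - 1)
        for m in pattern.finditer(chunk)
    ]


# Positions 421-480 and 2041-2100 of SEQ ID NO 1.
ROW_421_480 = (
    "CTTTGTAGTC" "GACTATCTTT" "TTATTTGAAT"
    "TCGCTTACGG" "TCTCAAACAA" "GCAATTTACA"
)
ROW_2041_2100 = (
    "CGGCGACTGG" "ATGGACGACG" "TGAGGGCCCT"
    "GGCGTCGTTT" "CTCGAGTCCG" "ACGAGGACTG"
)

print(find_sites(ROW_421_480, 421, "GAATTC"))     # [(447, 452)]
print(find_sites(ROW_2041_2100, 2041, "CTCGAG"))  # [(2081, 2086)]
```

Both results agree with the EcoRI and XhoI features listed above. The same function can be applied to the full sequence with `chunk_start=1`.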



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



TATCAACCTC CTGTGTTATT TTTAGTGACG GTTTCTTAAA AAACACCACT AGAAATCGTA 60
TTTTTATAGG TGGTTCCTTA AGAAAACTGC ATGCAGAAAT CCATGACGGT TTTCTTAAGG 120
AACCGTATGT AGAAATACGA TTTCTAGTGA CGATCTTCTT AAGGAAACCA CCACTAAAAA 180
TTATTTTTAT CCTTAATTTT CGAGTTTTTC AAACGATCTC GTATGATGAA ACCATCAAAA 240
TAAAAGTTGT ACATCTCTAA AAGTTATGAA AATTTGTAGT TAACAACTTT TTTATTTGAA 300
CTCATTTTGG TTCTCAAAAA TTGCATCTAA ATTTGTCAAA TTTAAAATTC AAATTTTCCA 360
AACGACCTCG GATGAAAAAA GTGTCAAAAT GAAAGTTGTA GAACTTCAAA AGTTATTCAA 420
CTTTGTAGTC GACTATCTTT TTATTTGAAT TCGCTTACGG TCTCAAACAA GCAATTTACA 480
CTCAGTTGGT TGTAATATGT GGACAATAAA ACTACAAACT AGACACAAAT CATACCATAG 540
ACGGAGTGGT AGCAGAGGGT ACGCGCGAGG GTGAGATAGA GGATTCTCCT AAAATAAATG 600
CACTTTAGAT GGGTAGGGTG GGGTGAGGCC TCTCCTAAAA TGAAACTCGT TTAATGTTTC 660
TAAAAATAGT TTTCACTGGT GATCCTTAGT TACTGGCATG TAAAAATGAT GATTTCTACT 720
GTCTCTCATA TGGACGGTTA TAAAAAATAC CATTATATTG AAAATAGGTC TCTGCTGCTA 780
CACTCGCCCT CATAGCAGAT CATGCATGCA CGCATCATTC GATCAGTTTT CGTTCTGATG 840
CAGTTTTCGA TAAATGCCAA TTTTTTAACT GCATACGTTG CCCTTGCTCA GCACCAGCAC 900
AGCAGTGTCG TGTCGTCCAT GCATGCACTT TAGGTGCAGT GCAGGGCCTC AACTCGGCCA 960
CGTAGTTAGC GCCACTGCTA CAGATCGAGG CACCGGTCAG CCGGCCACGC ACGTCGACCG 1020
CGCGCGTGCA TTTAAATACG CCGACGACGG AGCTTGATCG ACGAGAGAGC GAGCGCGATG 1080
GGGAGGAGGG CGTGTTGCGC GAAGGAAGGC GTTAAGAGAG GGGCGTGGAC GAGCAAGGAG 1140
GACGATGCCT TGGCCGCCTA CGTCAAGGCC CATGGCGAAG GCAAATGGAG GGAAGTGCCC 1200
CAGAAAGCCG GTAAAACTAG CTAGTCTTTT TATTTCATTT TGGGATCATA TATATACCCC 1260
CGAGGCAAGA CCGGAGGACG ATCACGTGTG TGGGTGCAGG TTTGCGTCGG TGCGGCAAGA 1320
GCTGCCGGCT GCGGTGGCTG AACTACCTCC GGCCCAACAT CAGGCGCGGC AACATCTCCT 1380
ACGACGAGGA GGATCTCATC ATCCGCCTCC ACAGGCTCCT CGGCAACAGG TCTGTGCAGT 1440
GGCCAGTGGT GGGCTAGCTT ATTACACGAG CTGACGACGA GGCGATCGAT CGAGCGTCTG 1500
CTGCGAATTC ATCTGTTCCG GTGTCGGCCG TGTGAGAGTG AGCTCATTCA TATGTACATG 1560
CGTGTTGGCG CGCAGGTGGT CGCTGATTGC AGGCAGGCTG CCTGGCCGAA CAGACAATGA 1620
AATCAAGAAC TACTGGAACA GCACGCTGGG CCGGAGGGCA GGCGCCGGCG CCGGCGCCGG 1680
CGGCAGCTGG GTCGTCGTCG CGCCGGACAC CGGCTCGCAC GCCACCCCGG CCGCGACGTC 1740
GGGCGCCTGC GAGACCGGCC AGAATAGCGC CGCTCATCGC GCGGACCCCG ACTCAGCCGG 1800
GACGACGACG ACCTCGGCGG CGGCGGTGTG GGCGCCCAAG GCCGTGCGGT GCAGGGGCGG 1860
ACTCTTCTTC TTCCACCGGG ACACGACGCC GGCGCACGCG GGCGAGACGG CGACGCCAAT 1920
GGCCGGTGGA GGTGGAGGAG GAGGAGGAGA AGCAGGGTCG TCGGACGACT GCAGCTCGGC 1980
GGCGTCGGTA TCGCTTCGCG TCGGAAGCCA CGACGAGCCG TGCTTCTCCG GCGACGGTGA 2040
CGGCGACTGG ATGGACGACG TGAGGGCCCT GGCGTCGTTT CTCGAGTCCG ACGAGGACTG 2100
GCTCCGCTGT CAGACGGCCG GGCAGCTTGC GTAGACAACA AGTACACGTA TAGATGTCCA 2160
ATAAGCACGA GGCCCGCGAG CCCGGCACGA AGCCCGCTTT TTGGGCCCGG TCCGAGCCCG 2220
GCACGGCCCG GTTATATGCA GACCCGGGCC GGCCCGGCAC GAATAAGCGG GCCGGGCTCG 2280
GACAGGAAAT TAGGCACGGT GAGCTAGCCC GGCACGGCCC GTTTAGGTCT AAGCCCGTTA 2340
AGCCCGTTTT TTTACACTAA AACGTGCTTC TCGGCCCGCA TAGCCCGCTT CTCGGCCCGC 2400
TTTTTTCGTG CTAAACGGGC CGGCCCGGCC CGGTTTAGGC CCGTTGCGGG CCGGGCTCGG 2460
ACAGGAAATT GAGCCCGCGT GCTTAGCCGG CCCGGCCCGG TTTTTTAATC GTGCCTGGCG 2520
GGCCAGGCCC AAAACGGGCC GGGCTTCACC GGGCCCGGGC CGGACCGGGC CGGGCGGCCC 2580
GTTTGGACAT CTCTAAGTAC ACGTATGGAG GAGAATATAT ATATAGTCAT GCGTACGTAT 2640
AGATTTTTTC ATCCGATCCC AACAGAAATA CGTATGAAAA TGCTCTTCGT TCTTTTTCAT 2700
TTATCATATC TATACTATAC TTAAAACACC AGTTTCAACG GTCGTCATGC GTCATTTTTT 2760
TACAAATAAC CCCTCACAGC TATTTCAAAT TAATCCGCTG CACGTCTATA GATGCCAAAC 2820
GACGCCCAAC ACGGGCTAGA TGCACGCGGG CCACAACTAT GGCACAGGCA CGTCATGCCG 2880
GCCTGCTAAC TGTGTCGGGC TAGCCCGTTA GCCCGTCGAT CCATTTAATT AAATTAGCGT 2940
AACGACGCCC GACACGGGCT AGATGCACGT GGGCCACAAC TATGGCACAT GCACGTCATG 3000
CCGGCCTGTT AACTGTGTCG GGCCAGTCTG TTAGCCCATT GATCCATTTA ATTAAATCAG 3060
CGTAAAATGT TAAAAACGGT GCAGGAGGTG GGGTTCGAAC CCATACCCTG ATGGAAGAAG 3120
GGCGGGAGAC ACTGGGTGAA ACTGTCTAAC CAGTAGAATA TCTATCACGC TAAGATGTTT 3180
TTAATATTGA ATATAAATTG TATATAAGCA TATAAGTTTT TTTGTAAAAT AAAAAATAAT 3240
CGTGTCGGGC CGGGCCATCA CTACTGGCCG AGGCTACAAC CCAAGCACGA CACGACGTTC 3300
TTGGCTCTTG CAAGCATTAG GTCGTTTCTG AGACCATATT GGCGCAATGG ACTACATGAT 3360
GTTTGGGGTT GCTGAATTGA ATGGAGCAGC AATAATTTGT CACACTAACA GCAAAATGAA 3420
AGGTTATTTG TTGGTTTTAA ACGTTAGTAA TTGCTACGAA GTAGCATAAT TTATATGGAG 3480
CGCATCCAGT TTTTATTGAT GCCTGACTTT AGCAATCACT CCATATTTTG ATCTATCTTT 3540
TTTATAAGTT TGACTTCATG GGACTTATTT TAGAACTTGA TCTCACAAAC TTTCTCTTAT 3600
TTTGTCTCTA TATGATGAAA TTGTGTCATT TTATAATCTT TGTTCATTCA GTCAATCGTT 3660
GTGAACTCTC TTCTAATCAC TCACTTCATT AGTTGTGTTG TACCAAGACA TATTTGCATA 3720
GAGTAAACAA TAACATCAGT TAGCCAAATC AAAAAATATA TTATACAGAG AGCGGAGACA 3780
ATCAAATAAA AAATCTTGAA ATTTTTTTAA TGGATAGTTT ACGTGGGTAT TGTTGTAAGC 3840
CGTCGCAACG CACGGGCAAC CGACTAGTTT TAGTTTATAA ATTAATAAAC GTACGACAAA 3900
TATTAAGAAC GCCACCTTTC CATGCCTACG CGCGCGTGAG ACACGACCGG GGCACGTCAG 3960
ACGTGTGCCC CTGTTGTATA ATTTATTTAC TTTTTAATGA CTATGTGCTG TTGGTTGCCG 4020
TTGGCTTCAT CGTGTTCGTA GCCATGCATA AATCCAGCG 4059

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4896 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: plasmid pTS256, linearized at Hindlll 

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (39..317)
(D) OTHER INFORMATION: /label= 3'nos
/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (318..869)
(D) OTHER INFORMATION: /label= bar
/note= "coding region of bar gene of Streptomyces hygroscopicus"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: complement (870..1702)
(D) OTHER INFORMATION: /label= P35S
/note= "35S promoter of Cauliflower Mosaic Virus"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1740..2284
(D) OTHER INFORMATION: /label= PTA29
/note= "promoter of TA29 gene of Nicotiana tabacum"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2285..2557
(D) OTHER INFORMATION: /label= barstar
/note= "coding region of barstar gene of Bacillus amyloliquefaciens"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2558..2879
(D) OTHER INFORMATION: /label= 3'nos
/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 1..38
(D) OTHER INFORMATION: /label= pUC19
/note= "pUC19 derived sequence"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 2880..4896
(D) OTHER INFORMATION: /label= pUC19
/note= "pUC19 derived sequence"

(ix) FEATURE:
(A) NAME/KEY: -
(B) LOCATION: 3004..3009
(D) OTHER INFORMATION: /label= EcoRI



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



AGCTTGCATG CCTGCAGGTC GACTCTAGAG GATCTTCCCG ATCTAGTAAC ATAGATGACA 60
CCGCGCGCGA TAATTTATCC TAGTTTGCGC GCTATATTTT GTTTTCTATC GCGTATTAAA 120
TGTATAATTG CGGGACTCTA ATCATAAAAA CCCATCTCAT AAATAACGTC ATGCATTACA 180
TGTTAATTAT TACATGCTTA ACGTAATTCA ACAGAAATTA TATGATAATC ATCGCAAGAC 240
CGGCAACAGG ATTCAATCTT AAGAAACTTT ATTGCCAAAT GTTTGAACGA TCTGCTTCGG 300
ATCCTAGACG CGTGAGATCA GATCTCGGTG ACGGGCAGGA CCGGACGGGG CGGTACCGGC 360
AGGCTGAAGT CCAGCTGCCA GAAACCCACG TCATGCCAGT TCCCGTGCTT GAAGCCGGCC 420
GCCCGCAGCA TGCCGCGGGG GGCATATCCG AGCGCCTCGT GCATGCGCAC GCTCGGGTCG 480
TTGGGCAGCC CGATGACAGC GACCACGCTC TTGAAGCCCT GTGCCTCCAG GGACTTCAGC 540
AGGTGGGTGT AGAGCGTGGA GCCCAGTCCC GTCCGCTGGT GGCGGGGGGA GACGTACACG 600
GTCGACTCGG CCGTCCAGTC GTAGGCGTTG CGTGCCTTCC AGGGGCCCGC GTAGGCGATG 660
CCGGCGACCT CGCCGTCCAC CTCGGCGACG AGCCAGGGAT AGCGCTCCCG CAGACGGACG 720
AGGTCGTCCG TCCACTCCTG CGGTTCCTGC GGCTCGGTAC GGAAGTTGAC CGTGCTTGTC 780
TCGATGTAGT GGTTGACGAT GGTGCAGACC GCCGGCATGT CCGCCTCGGT GGCACGGCGG 840
ATGTCGGCCG GGCGTCGTTC TGGGTCCATG GTTATAGAGA GAGAGATAGA TTTATAGAGA 900
GAGACTGGTG ATTTCAGCGT GTCCTCTCCA AATGAAATGA ACTTCCTTAT ATAGAGGAAG 960
GGTCTTGCGA AGGATAGTGG GATTGTGCGT CATCCCTTAC GTCAGTGGAG ATGTCACATC 1020
AATCCACTTG CTTTGAAGAC GTGGTTGGAA CGTGTTCTTT TTCCACGATG CTCCTCGTGG 1080
GTGGGGGTCC ATCTTTGGGA CCACTGTCGG CAGAGGCATC TTGAATGATA GCCTTTCCTT 1140
TATCGCAATG ATGGCATTTG TAGGAGCCAC CTTCCTTTTC TACTGTCCTT TCGATGAAGT 1200
GACAGATAGC TGGGCAATGG AATCCGAGGA GGTTTCCCGA AATTATCCTT TGTTGAAAAG 1260
TCTCAATAGC CCTTTGGTCT TCTGAGACTG TATCTTTGAC ATTTTTGGAG TAGACCAGAG 1320
TGTCGTGCTC CACCATGTTG ACGAAGATTT TCTTCTTGTC ATTGAGTCGT AAAAGACTCT 1380
GTATGAACTG TTCGCCAGTC TTCACGGCGA GTTCTGTTAG ATCCTCGATT TGAATCTTAG 1440
ACTCCATGCA TGGCCTTAGA TTCAGTAGGA ACTACCTTTT TAGAGACTCC AATCTCTATT 1500
ACTTGCCTTG GTTTATGAAG CAAGCCTTGA ATCGTCCATA CTGGAATAGT ACTTCTGATC 1560
TTGAGAAATA TGTCTTTCTC TGTGTTCTTG ATGCAATTAG TCCTGAATCT TTTGACTGCA 1620
TCTTTAACCT TCTTGGGAAG GTATTTGATC TCCTGGAGAT TGTTACTCGG GTAGATCGTC 1680
TTGATGAGAC CTGCTGCGTA GGAGCTTGCA TGCCTGCAGG TCGACTCTAG AGGATCCCCA 1740
TCTAGCTAAG TATAACTGGA TAATTTGCAT TAACAGATTG AATATAGTGC CAAACAAGAA 1800
GGGACAATTG ACTTGTCACT TTATGAAAGA TGATTCAAAC ATGATTTTTT ATGTACTAAT 1860
ATATACATCC TACTCGAATT AAAGCGACAT AGGCTCGAAG TATGCACATT TAGCAATGTA 1920
AATTAAATCA GTTTTTGAAT CAAGCTAAAA GCAGACTTGC ATAAGGTGGG TGGCTGGACT 1980
AGAATAAACA TCTTCTCTAG CACAGCTTCA TAATGTAATT TCCATAACTG AAATCAGGGT 2040
GAGACAAAAT TTTGGTACTT TTTCCTCACA CTAAGTCCAT GTTTGCAACA AATTAATACA 2100
TGAAACCTTA ATGTTACCCT CAGATTAGCC TGCTACTCCC CATTTTCCTC GAAATGCTCC 2160
AACAAAAGTT AGTTTTGCAA GTTGTTGTGT ATGTCTTGTG CTCTATATAT GCCCTTGTGG 2220
TGCAAGTGTA ACAGTACAAC ATCATCACTC AAATCAAAGT TTTTACTTAA AGAAATTAGC 2280
TACCATGAAA AAAGCAGTCA TTAACGGGGA ACAAATCAGA AGTATCAGCG ACCTCCACCA 2340
GACATTGAAA AAGGAGCTTG CCCTTCCGGA ATACTACGGT GAAAACCTGG ACGCTTTATG 2400
GGATTGTCTG ACCGGATGGG TGGAGTACCC GCTCGTTTTG GAATGGAGGC AGTTTGAACA 2460
AAGCAAGCAG CTGACTGAAA ATGGCGCCGA GAGTGTGCTT CAGGTTTTCC GTGAAGCGAA 2520
AGCGGAAGGC TGCGACATCA CCATCATACT TTCTTAATAC GATCAATGGG AGATGAACAA 2580
TATGGAAACA CAAACCCGCA AGCTTGGTCT AGAGGATCCG AAGCAGATCG TTCAAACATT 2640
TGGCAATAAA GTTTCTTAAG ATTGAATCCT GTTGCCGGTC TTGCGATGAT TATCATATAA 2700
TTTCTGTTGA ATTACGTTAA GCATGTAATA ATTAACATGT AATGCATGAC GTTATTTATG 2760
AGATGGGTTT TTATGATTAG AGTCCCGCAA TTATACATTT AATACGCGAT AGAAAACAAA 2820

ATATAGCGCG CAAACTAGGA TAAATTATCG CGCGCGGTGT CATCTATGTT ACTAGATCGG 2880

GAAGATCCCC GGGTACCGAG CTCGAATTCT GATCAGGCCA ACGCGCGGGG AGAGGCGGTT 2940

TGCGTATTGG GCGCTCTTCC GCTTCCTCGC TCACTGACTC GCTGCGCTCG GTCGTTCGGC 3000

TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG GTTATCCACA GAATCAGGGG 3060

ATAACGCAGG AAAGAACATG TGAGCAAAAG GCCAGCAAAA GGCCAGGAAC CGTAAAAAGG 3120

CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA CGAGCATCAC AAAAATCGAC 3180

GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG ATACCAGGCG TTTCCCCCTG 3240

GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT 3300

TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG 3360

TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG CCCGACCGCT 3420

GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT AAGACACGAC TTATCGCCAC 3480

TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA TGTAGGCGGT GCTACAGAGT 3540

TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC 3600

TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA 3660

CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA AAAAAAGGAT 3720

CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC TCAGTGGAAC GAAAACTCAC 3780

GTTAAGGGAT TTTGGTCATG AGACTCGAGC CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 3840

ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 3900

ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT TCATCCATAG 3960

TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA GGGCTTACCA TCTGGCCCCA 4020

GTGCTGCAAT GATACCGCGA GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC 4080

AGCCAGCCGG AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 4140

CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 4200

TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG GCTTCATTCA 4260

GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC CATGTTGTGC AAAAAAGCGG 4320

TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA 4380

TGGTTATGGC AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG 4440

TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTGCT 4500

CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA AAAGTGCTCA 4560

TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT CTTACCGCTG TTGAGATCCA 4620

GTTCGATGTA ACCCACTCGT GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG 4680

TTTCTGGGTG AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 4740

GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 4800

ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC 4860

CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCA 4896
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3544 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pTS200

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3227..3504

(D) OTHER INFORMATION: /label= 3'nos

/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2675..3226

(D) OTHER INFORMATION: /label= bar

/note= "coding region of bar gene of Streptomyces hygroscopicus"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1841..2674

(D) OTHER INFORMATION: /label= P35S

/note= "35S promoter of Cauliflower Mosaic Virus"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (626..1803)

(D) OTHER INFORMATION: /label= PCA55

/note= "promoter of CA55 gene of Zea mays"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (353..625)

(D) OTHER INFORMATION: /label= barstar

/note= "coding region of barstar gene of Bacillus
amyloliquefaciens"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: complement (30..352)

(D) OTHER INFORMATION: /label= 3'nos

/note= "3' regulatory sequence containing the polyadenylation
signal of the nopaline synthase gene of Agrobacterium T-DNA"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3539..3544

(D) OTHER INFORMATION: /label= HindIII
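
The feature locations listed for SEQ ID NO: 3 are 1-based and inclusive, and a location written as complement (a..b) denotes an element read from the opposite strand. The following is a minimal Python sketch of how such a location could be resolved against the listed strand. The helper names `reverse_complement` and `feature_sequence` are illustrative assumptions and are not part of the listing, and only the first 60 bases of SEQ ID NO: 3 are used, so the complement example takes an arbitrary window inside complement (30..352) rather than the whole element.

```python
# Illustrative helpers (hypothetical names, not part of the sequence listing).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_sequence(seq: str, start: int, end: int, complement: bool = False) -> str:
    """Extract positions start..end (1-based, inclusive) from seq.

    If complement is True, the extracted stretch is returned as read
    from the opposite strand (reverse complement).
    """
    if not 1 <= start <= end <= len(seq):
        raise ValueError("location lies outside the supplied sequence")
    sub = seq[start - 1:end]
    return reverse_complement(sub) if complement else sub


# First 60 bases of SEQ ID NO: 3 as given in its sequence description.
SEQ3_HEAD = (
    "GAATTCGAGC" "TCGGTACCCG" "GGGATCTTCC"
    "CGATCTAGTA" "ACATAGATGA" "CACCGCGCGC"
)

# Feature labelled EcoRI, location 1..6.
ecori = feature_sequence(SEQ3_HEAD, 1, 6)

# Arbitrary 10-base window (31..40) inside the element at complement (30..352).
nos_window = feature_sequence(SEQ3_HEAD, 31, 40, complement=True)

print(ecori, nos_window)
```

Applying the same helper to the full 3544-base sequence with the locations listed for SEQ ID NO: 3 would return each element in its own reading orientation.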

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



GAATTCGAGC TCGGTACCCG GGGATCTTCC CGATCTAGTA ACATAGATGA CACCGCGCGC 60

GATAATTTAT CCTAGTTTGC GCGCTATATT TTGTTTTCTA TCGCGTATTA AATGTATAAT 120

TGCGGGACTC TAATCATAAA AACCCATCTC ATAAATAACG TCATGCATTA CATGTTAATT 180

ATTACATGCT TAACGTAATT CAACAGAAAT TATATGATAA TCATCGCAAG ACCGGCAACA 240

GGATTCAATC TTAAGAAACT TTATTGCCAA ATGTTTGAAC GATCTGCTTC GGATCCTCTA 300

GACCAAGCTT GCGGGTTTGT GTTTCCATAT TGTTCATCTC CCATTGATCG TATTAAGAAA 360

GTATGATGGT GATGTCGCAG CCTTCCGCTT TCGCTTCACG GAAAACCTGA AGCACACTCT 420

CGGCGCCATT TTCAGTCAGC TGCTTGCTTT GTTCAAACTG CCTCCATTCC AAAACGAGCG 480

GGTACTCCAC CCATCCGGTC AGAGAATCCC ATAAAGCGTC CAGGTTTTCA CCGTAGTATT 540

CCGGAAGGGC AAGCTCCTTT TTCAATGTCT GGTGGAGGTC GCTGATACTT CTGATTTGTT 600

CCCCGTTAAT GACTGCTTTT TTCATGGCTG CAGCTAGTTA GCTCGATGTA TCTTCTGTAT 660

ATGCAGTGCA GCTTCTGCGT TTTGGCTGCT TTGAGCTGTG AAATCTCGCT TTCCAGTCCC 720

TGCGTGTTTT ATAGTGCTGT ACGTTCGTGA TCGTGAGCAA ACAGGGCGTG CCTCAACTAC 780

TGGTTTGGTT GGGTGACAGG CGCCAACTAC GTGCTCGTAA CCGATCGAGT GAGCGTAATG 840

CAACATTTTT TCTTCTTCTC TCGCATTGGT TTCATCCAGC CAGGAGACCC GAATCGAATT 900

GAAATCACAA ATCTGAGGTA CAGTATTTTT ACAGTACCGT TCGTTCGAAG GTCTTCGACA 960

GGTCAAGGTA ACAAAATCAG TTTTAAATTG TTGTTTCAGA TCAAAGAAAA TTGAGATGAT 1020

CTGAAGGACT TGGACCTTCG TCCAATGAAA CACTTGGACT AATTAGAGGT GAATTGAAAG 1080

CAAGCAGATG CAACCGAAGG TGGTGAAAGT GGAGTTTCAG CATTGACGAC GAAAACCTTC 1140

GAACGGTATA AAAAAGAAGC CGCAATTAAA CGAAGATTTG CCAAAAAGAT GCATCAACCA 1200

AGGGAAGACG TGCATACATG TTTGATGAAA ACTCGTAAAA ACTGAAGTAC GATTCCCCAT 1260

TCCCCTCCTT TTCTCGTTTC TTTTAACTGA AGCAAAGAAT TTGTATGTAT TCCCTCCATT 1320

CCATATTCTA GGAGGTTTTG GCTTTTCATA CCCTCCTCCA TTTCAAATTA TTTGTCATAC 1380

ATTGAAGATA TACACCATTC TAATTTATAC TAAATTACAG CTTTTAGATA CATATATTTT 1440

ATTATACACT TAGATACGTA TTATATAAAA CACCTAATTT AAAATAAAAA ATTATATAAA 1500

AAGTGTATCT AAAA^TCAA AATACGACAT AATTTGAAAC GGAGGGGTAC TACTTATGCA 1560

AACCAATCGT GGTAACCCTA AACCCTATAT GAATGAGGCC ATGATTGTAA TGCACCGTCT 1620

GATTAACCAA GATATCAATG GTCAAAGATA TACATGATAC ATCCAAGTCA CAGCGAAGGC 1680

AAATGTGACA ACAGTTTTTT TTACCAGAGG GACAAGGGAG AATATCTATT CAGATGTCAA 1740

GTTCCCGTAT CACACTGCCA GGTCCTTACT CCAGACCATC TTCCGGCTCT ATTGATGCAT 1800

ACCAGGAATT GATCTAGAGT CGACCTGCAG GCATGCAAGC TCCTACGCAG CAGGTCTCAT 1860

CAAGACGATC TACCCGAGTA ACAATCTCCA GGAGATCAAA TACCTTCCCA AGAAGGTTAA 1920

AGATGCAGTC AAAAGATTCA GGACTAATTG CATCAAGAAG ACAGAGAAAG ACATATTTCT 1980

CAAGATCAGA AGTACTATTC CAGTATGGAC GATTCAAGGC TTGCTTCATA AACCAAGGCA 2040

AGTAATAGAG ATTGGAGTCT CTAAAAAGGT AGTTCCTACT GAATCTAAGG CCATGCATGG 2100

AGTCTAAGAT TCAAATCGAG GATCTAACAG AACTCGCCGT GAAGACTGGC GAACAGTTCA 2160

TACAGAGTCT TTTACGACTC AATGACAAGA AGAAAATCTT CGTCAACATG GTGGAGCACG 2220

ACACTCTGGT CTACTCCAAA AATGTCAAAG ATACAGTCTC AGAAGACCAA AGGGCTATTG 2280

AGACTTTTCA ACAAAGGATA ATTTCGGGAA ACCTCCTCGG ATTCCATTGC CCAGCTATCT 2340

GTCACTTCAT CGAAAGGACA GTAGAAAAGG AAGGTGGCTC CTACAAATGC CATCATTGCG 2400

ATAAAGGAAA GGCTATCATT CAAGATGCCT CTGCCGACAG TGGTCCCAAA GATGGACCCC 2460

CACCCACGAG GAGCATCGTG GAAAAAGAAG ACGTTCCAAC CACGTCTTCA AAGCAAGTGG 2520

ATTGATGTGA CATCTCCACT GACGTAAGGG ATGACGCACA ATCCCACTAT CCTTCGCAAG 2580

ACCCTTCCTC TATATAAGGA AGTTCATTTC ATTTGGAGAG GACACGCTGA AATCACCAGT 2640

CTCTCTCTAT AAATCTATCT CTCTCTCTAT AACCATGGAC CCAGAACGAC GCCCGGCCGA 2700

CATCCGCCGT GCCACCGAGG CGGACATGCC GGCGGTCTGC ACCATCGTCA ACCACTACAT 2760

CGAGACAAGC ACGGTCAACT TCCGTACCGA GCCGCAGGAA CCGCAGGAGT GGACGGACGA 2820

CCTCGTCCGT CTGCGGGAGC GCTATCCCTG GCTCGTCGCC GAGGTGGACG GCGAGGTCGC 2880

CGGCATCGCC TACGCGGGCC CCTGGAAGGC ACGCAACGCC TACGACTGGA CGGCCGAGTC 2940

GACCGTGTAC GTCTCCCCCC GCCACCAGCG GACGGGACTG GGCTCCACGC TCTACACCCA 3000

CCTGCTGAAG TCCCTGGAGG CACAGGGCTT CAAGAGCGTG GTCGCTGTCA TCGGGCTGCC 3060

CAACGACCCG AGCGTGCGCA TGCACGAGGC GCTCGGATAT GCCCCCCGCG GCATGCTGCG 3120

GGCGGCCGGC TTCAAGCACG GGAACTGGCA TGACGTGGGT TTCTGGCAGC TGGACTTCAG 3180

CCTGCCGGTA CCGCCCCGTC CGGTCCTGCC CGTCACCGAG ATCTGATCTC ACGCGTCTAG 3240

GATCCGAAGC AGATCGTTCA AACATTTGGC AATAAAGTTT CTTAAGATTG AATCCTGTTG 3300

CCGGTCTTGC GATGATTATC ATATAATTTC TGTTGAATTA CGTTAAGCAT GTAATAATTA 3360

ACATGTAATG CATGACGTTA TTTATGAGAT GGGTTTTTAT GATTAGAGTC CCGCAATTAT 3420

ACATTTAATA CGCGATAGAA AACAAAATAT AGCGCGCAAA CTAGGATAAA TTATCGCGCG 3480

CGGTGTCATC TATGTTACTA GATCGGGAAG ATCCTCTAGA GTCGACCTGC AGGCATGCAA 3540

GCTT 3544
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: oligonucleotide 1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CGTTTCTCGA ATCCGACGAG G 21 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4824 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: plasmid pCOL9

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 396..401

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2367..2379

(D) OTHER INFORMATION: /label= SfiI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 884..888

(D) OTHER INFORMATION: /label= C1-S

/note= "TGCAG (in C1) which in C1-S allele is replaced with
TTAGG"
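
The C1-S feature of SEQ ID NO: 5 describes a five-base replacement at positions 884..888. As a hedged illustration only, the Python sketch below applies that replacement to the 60-base stretch that the sequence description lists for positions 841 to 900, after first checking that the bases found at the stated location are the ones named in the note. The function name `substitute` and its `offset` parameter are assumptions made for this example and do not come from the listing.

```python
def substitute(window: str, start: int, end: int,
               expected: str, replacement: str, offset: int = 1) -> str:
    """Replace positions start..end (1-based, inclusive, in the numbering
    of the full sequence) inside `window`, whose first base carries the
    position number `offset`. Raises ValueError if the bases found at
    that location are not `expected`."""
    lo = start - offset          # 0-based index of first replaced base
    hi = end - offset + 1        # 0-based index one past the last replaced base
    if lo < 0 or hi > len(window):
        raise ValueError("location lies outside the supplied window")
    found = window[lo:hi]
    if found != expected:
        raise ValueError(f"expected {expected} at {start}..{end}, found {found}")
    return window[:lo] + replacement + window[hi:]


# Positions 841..900 of SEQ ID NO: 5 as given in its sequence description.
C1_841_900 = (
    "CACCAGCACA" "GCAGTGTCGT" "GTCGTCCATG"
    "CATGCACTTT" "AGGTGCAGTG" "CAGGGCCTCA"
)

# TGCAG at 884..888 (C1) replaced with TTAGG (C1-S), as stated in the note.
c1s_841_900 = substitute(C1_841_900, 884, 888, "TGCAG", "TTAGG", offset=841)
print(c1s_841_900)
```

Because the replacement has the same length as the replaced bases, the numbering of all downstream positions is unchanged.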



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 60

CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 120

TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 180
ACCATATGCG


GTGTGAAATA 


CCGCACAGAT 


GCGTAAfiflAf; 




ATCAGGCGCC 


240 


ATTCGCCATT 


CAGGCTGCGC 


AACTGTTGGG 


AAGGGPGATC 


oox oL-bobLL 


ILI 1 L.GCTAT 


300 


TACGCCAGCT 


GGCGAAAGGG 


GGATGTGCTG 


CAAG G C G ATT 


AAGTTGGGTA 


ACGCCAGGGT 


360 


TTTCCCAGTC 


ACGACGTTGT 


AAAACGACGG 


CCAGTGAATT 


CGCTTACGGT 


CTCAAACAAG 


420 


CAATTTACAC 


TCAGTTGGTT 


GTAATATGTG 


GACAATAAAA 


CTACAAACTA 


GACACAAATC 


480 


ATACCATAGA 


CGGAGTGGTA 


GCAGAGGGTA 




TGAGATAGAG 


GATTCTCCTA 


540 


AAATAAATGC 


ACTTTAGATG 


GGTAGGGTGG 




CTCCTAAAAT 


GAAACTCGTT 


600 


TAATGTTTCT 


AAAAATAGTT 


TTCACTGGTG 


rtl ^.L. 1 Inb 1 1 


ACTGGCATGT AAAAATGATG 


660 


ATTTCTACTG 


TCTCTCATAT 


GGACGGTTAT 




ATTATATTGA AAATAGGTCT 


720 


CTGCTGCTAC 


ACTCGCCCTC 


ATAGCAGATC 




GCATCATTCG 


AT CAGTTTTC 


780 


GTTCTGATGC 


AGTTTTCGAT 


AAATGCCAAT 


* * x X A nn'w 1 vj 


CATACGTTGC 


CCTTGCTCAG 


840 


CACCAGCACA 


GCAGTGTCGT 


GTCGTCCATG 


CATGCACTTT 


AGGTGCAGTG 


CAGGGCCTCA 


900 


ACTCGGCCAC 


GTAGTTAGCG 


CCACTGCTAC 


AG AT r a a r* 


ACCGGTCAGC 


CGGCCACGCA 


960 


CGTCGACCGC GCGCGTGCAT TTAAATACGC CGACGACGGA GCTTGATCGA CGAGAGAGCG 1020

AGCGCGATGG GGAGGAGGGC GTGTTGCGCG AAGGAAGGCG TTAAGAGAGG GGCGTGGACG 1080

AGCAAGGAGG ACGATGCCTT GGCCGCCTAC GTCAAGGCCC ATGGCGAAGG CAAATGGAGG 1140

GAAGTGCCCC AGAAAGCCGG TAAAACTAGC TAGTCTTTTT ATTTCATTTT GGGATCATAT 1200

ATATACCCCC GAGGCAAGAC CGGAGGACGA TCACGTGTGT GGGTGCAGGT TTGCGTCGGT 1260

GCGGCAAGAG CTGCCGGCTG CGGTGGCTGA ACTACCTCCG GCCCAACATC AGGCGCGGCA 1320

ACATCTCCTA CGACGAGGAG GATCTCATCA TCCGCCTCCA CAGGCTCCTC GGCAACAGGT 1380

CTGTGCAGTG GCCAGTGGTG GGCTAGCTTA TTACACGAGC TGACGACGAG GCGATCGATC 1440

GAGCGTCTGC TGCGAATTCA TCTGTTCCGG TGTCGGCCGT GTGAGAGTGA GCTCATTCAT 1500

ATGTACATGC GTGTTGGCGC GCAGGTGGTC GCTGATTGCA GGCAGGCTGC CTGGCCGAAC 1560

AGACAATGAA ATCAAGAACT ACTGGAACAG CACGCTGGGC CGGAGGGCAG GCGCCGGCGC 1620

CGGCGCCGGC GGCAGCTGGG TCGTCGTCGC GCCGGACACC GGCTCGCACG CCACCCCGGC 1680

CGCGACGTCG GGCGCCTGCG AGACCGGCCA GAATAGCGCC GCTCATCGCG CGGACCCCGA 1740

CTCAGCCGGG ACGACGACGA CCTCGGCGGC GGCGGTGTGG GCGCCCAAGG CCGTGCGGTG 1800

CACGGGCGGA CTCTTCTTCT TCCACCGGGA CACGACGCCG GCGCACGCGG GCGAGACGGC 1860

GACGCCAATG GCCGGTGGAG GTGGAGGAGG AGGAGGAGAA GCAGGGTCGT CGGACGACTG 1920

CAGCTCGGCG GCGTCGGTAT CGCTTCGCGT CGGAAGCCAC GACGAGCCGT GCTTCTCCGG 1980

CGACGGTGAC GGCGACTGGA TGGACGACGT GAGGGCCCTG GCGTCGTTTC TCGAGTCCGA 2040

CGAGGACTGG CTCCGCTGTC AGACGGCCGG GCAGCTTGCG TAGACAACAA GTACACGTAT 2100

AGATGTCCAA TAAGCACGAG GCCCGCGAGC CCGGCACGAA GCCCGCTTTT TGGGCCCGGT 2160

CCGAGCCCGG CACGGCCCGG TTATATGCAG ACCCGGGCCG GCCCGGCACG AATAAGCGGG 2220

CCGGGCTCGG ACAGGAAATT AGGCACGGTG AGCTAGCCCG GCACGGCCCG TTTAGGTCTA 2280

AGCCCGTTAA GCCCGTTTTT TTACACTAAA ACGTGCTTCT CGGCCCGCAT AGCCCGCTTC 2340

TCGGCCCGCT TTTTTCGTGC TAAACGGGCC GGCCCGGCCC GGTTTAGGCC CGTTGCGGGC 2400

CGGGCTCGGA CAGGAAATTG AGCCCGCGTG CTTAGCCGGC CCGGCCCGGT TTTTTAATCG 2460

TGCCTGGCGG GCCAGGCCCA AAACGGGCCG GGCTTCACCG GGCCCGGGCC GGACCGGGCC 2520

GGGCGGCCCG TTTGGACATC TCTAAGTACA CGTATGGAGG AGAATATATA TATAGTCATG 2580

CGTACAGCTT GGCGTAATCA TGGTCATAGC TGTTTCCTGT GTGAAATTGT TATCCGCTCA 2640

CAATTCCACA CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGGGT GCCTAATGAG 2700

TGAGCTAACT CACATTAATT GCGTTGCGCT CACTGCCCGC TTTCCAGTCG GGAAACCTGT 2760

CGTGCCAGCT GCATTAATGA ATCGGCCAAC GCGCGGGGAG AGGCGGTTTG CGTATTGGGC 2820

GCTCTTCCGC TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG 2880

TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA 2940

AGAACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG 3000

CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA 3060

GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG 3120

TGCGCTCTCC TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG 3180

GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC 3240

GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG 3300

GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA 3360

CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT 3420

GGCCTAACTA CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG 3480

TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG 3540

GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC 3600

CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT 3660

TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT 3720

TTAAATCAAT CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA 3780

GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCG 3840

TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT GCAATGATAC 3900

CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT AAACCAGCGA GCCGGAAGGG 3960

CCGAGCGCAG AAGTGGTCCT GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC 4020

GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTA 4080

CAGGCATCGT GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC GGTTCCCAAC 4140

GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA AGCGGTTAGC TCCTTCGGTC 4200

CTCCGATCGT TGTCAGAAGT AAGTTGGCCG GAGTGTTATC ACTCATGGTT ATGGCAGCAC 4260

TGCATAATTC TCTTACTGTC ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT 4320

CAACCAAGTC ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA 4380

TACGGGATAA TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT GGAAAACGTT 4440

CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCAGTTCG ATGTAACCCA 4500

CTCGTGCACC CAACTGATCT TCAGCATCTT TTACTTTCAC CAGCGTTTCT GGGTGAGCAA 4560

AAACAGGAAG GCAAAATGCC GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC 4620

TCATACTCTT CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG 4680

GATACATATT TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC 4740

GAAAAGTGCC ACCTGACGTC TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA 4800

GGCGTATCAC GAGGCCCTTT CGTC 4824


(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3915 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pCOL13

(ix) FEATURE:

(A) NAME/KEY: prim_transcript

(B) LOCATION: 188

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 188..212

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 213..556

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 557..718

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 719..1224

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1225..2770

(D) OTHER INFORMATION: /codon_start= 2

/note= "exon containing 3' end coding region of B-peru gene"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 576..718

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1225..2770

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1268..2770

(D) OTHER INFORMATION: /note= "3' end of B-peru coding
region which is derived from cDNA"

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 2771..3272

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3273..3891

(D) OTHER INFORMATION: /label= 3' region

/note= "further 3' flanking region of B-peru gene. This region
is only of approximate length and the sequence needs to be
confirmed."

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 11..16

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 45..50

(D) OTHER INFORMATION: /label= KpnI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 265..270

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 329..334

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 835..840

(D) OTHER INFORMATION: /label= BamHI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1268..1273

(D) OTHER INFORMATION: /label= MluI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2787..2792

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2883..2888

(D) OTHER INFORMATION: /label= MunI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2827..2832

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3892..3897

(D) OTHER INFORMATION: /label= SalI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3910..3915

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3892..3915

(D) OTHER INFORMATION: /label= polylinker
/note= "part of polylinker of pUC19"
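
The exon, intron, 3'UTR, 3' region and polylinker locations listed for SEQ ID NO: 6 are stated so that each element starts at the position following the end of the previous one. A minimal Python sketch of that bookkeeping is given below as an illustration only; the names `SEGMENTS` and `find_breaks` are assumptions introduced for this example, and the coordinates are copied from the feature entries of SEQ ID NO: 6.

```python
# (name, start, end), 1-based inclusive, as listed for SEQ ID NO: 6.
SEGMENTS = [
    ("exon", 188, 212),
    ("intron", 213, 556),
    ("exon", 557, 718),
    ("intron", 719, 1224),
    ("exon", 1225, 2770),
    ("3'UTR", 2771, 3272),
    ("3' region", 3273, 3891),
    ("polylinker", 3892, 3915),
]


def find_breaks(segments):
    """Return (previous name, previous end, next name, next start) for every
    pair of consecutive segments that leaves a gap or an overlap."""
    breaks = []
    for (name1, _, end1), (name2, start2, _) in zip(segments, segments[1:]):
        if start2 != end1 + 1:
            breaks.append((name1, end1, name2, start2))
    return breaks


breaks = find_breaks(SEGMENTS)
covered = sum(end - start + 1 for _, start, end in SEGMENTS)
print(breaks, covered)
```

An empty list of breaks means the listed elements tile positions 188 to 3915 without gaps or overlaps; the same check could be run on any other feature table by replacing the coordinate list.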

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



GAATTCAGGT TCTAGACTAT TCTTGTGGCC TCGGGCGGAT GGCGGGTACC CATGTCTTCG 60

TTAGGCTTAT CTGACCGTGG AGATGAAATC TAACGGCTCA TAGAAATTAA ACTAACGTGG 120

ACACTCTGTC CTTGCTGTTT TGCTCCCTGC TCTTTATATA TAGAATGCCT GCTTGCATTG 180

CACCCGTACG TACAGCGTAG CGCGGAGTGG AGGTGAGCTC CTCCTCCGAT TCTTGCCTAA 240

TCTTTGGTCT TTGCACACGT ACGAAAGCTT TTTGCATTGT TTCGTTGCTT CTGGATGATC 300

AGTACTCTTA GATATTAAGC GATACCGATC TAGAATCGAG TTGTTGTACT CTCTCTGTCC 360

CTTTTGTGCA GCTATAACTA GCTAGGTTCC TTCGCATAGA GCCTCTCTAC AGAGTACAGA 420

CTAGCTAGCA GTGTCAGACA CGAAATGGAA ATGGTCACTT CCAAATTGCA CGAGCTGGAA 480

TTATATACTC TTCTGATCTT CTTCACCGTC TCTTTATAGC GTGATATGCG TTTCTGGCTT 540

CTTGCTTACG TGAAGGATTA TTAGTAAGGC GCGTGATGGC GCTCTCAGCT TCCCCGGCTC 600

AGGAAGAACT GCTGCAGCCT GCTGGGAGGC CGTTGAGGAA GCAGCTTGCT GCAGCCGCGA 660

GGAGCATCAA CTGGAGCTAT GCCCTCTTCT GGTCCATTTC AAGCACTCAA CGACCTCGGT 720

AAATGGAAGT CCTGATAATC TATAATTTGT CTGGCAGTTT TCTACAACTC TGGTGAATGA 780
1 LvJl wn w IH- 


GTTTGCCTGA 


TACATACATA 


C AT AC AT AT G 


AAATAAAGAA 


AGTCGGATCC 


840 


PPTGATGPGA 

WVJ 1 VjAl VJwWAV 


TTf^T A GTT AT 

1 1 O X AO X 1 A A 


CGCTTTTCCG 

vU^l XXX Www 


CAAAAT GGTT 


GCTTTTTGAA 


TCTGCATTCG 


900 


111111 1 www 


APATPTTPTT 
nun l w 1 i^i l 


CCTTCTCGPG 


AGT AAP G AC A 


ACGCCACCCG 


CGCCGCCTGC 


960 


p p p pp a t p n p 

HowLwAX LuL 


LLtoLL 1 1 w O 


PPGGPGAGAG 


PPTPAGPPTA 


TTACACCAGC 

X X AW»v> wow W 


GGCGACCTCT 

www uii^ X V* X 


1020 


111 LLLU lit 


Li^l waV-»ww> w 


w w X w W 1 ww w w 


GTGPTPTPPP 

V3 X OV~» 1 V-> 1 www 


PPGCTCTAAC 

\_ w ww x w innw 


PTGGTPTGGC 

w x ww x w x www 


1080 


w w w w 1 w w w- w 1 


uuUnLL 1 bL 1 


rrGnrfznrrT 

LV/UO^OOt^ X 


wAVwwwOwOl w 


TTTCTCGTCC 

X X IwXwWXww^ 


CTAC C CT CT C 

w X ^\w w w X w X w 


1140 


I L»w w 1LI Ljxjtj 


PPPATP ATP A 
LuLA 1 wAl wa 


TPTGATATTP 
itl wa Xa.1 X w 


TG ATGPA A AT 
x uAl otnnn.1 


AAAAAAGGTA 


TAPPATATAA 

X Aw wA X AV X AVAV 


1200 


o ovn w/\n wn oa\ 


A A AT ATPPTT 
A/*»nx AX OOl I 


PP APPPTPPT 
OwAo oo 1 bL 1 . 


PAPPTPP.APG 


GAPGGGTTPT 

OnwOwwl Xwl 


APAATGGPGA 

AwAAl Owv^UA . 


1260 


rrfpr;7\ a papp 
ww l waawaww 


PPTA AP ATPT 
tu X nnonl w 1 


PPPAPTPPPT 


GGAGPTGAPA 


GPPGACCAGC 

W w \~r On^ wAVww 


TGCTCATGCA 

X ww X w>Afc x w un 


1320 


gaggagpgag 

VxAw wAVw w w A w 


PAGPTPPGGG 


AGPTPTAPGA 


X Uw\9w 


TCCGGCGAGT 


GCGACCGCCG 


1380 


CGGCGCGCGG 


PCGGTGGGCT 

W w W W X W^ VJ W V-*- X 


CGCTGTCGCC 


GGAGGACCTC 


GGG GACACCG 


AGTGGTACTA 


1440 


CGTGAT CT G C 

W»W X W^ A V v 


ATGACCTACG 


CCTTCCTGCC 


GGGCCAAGGC 


TTGCCCGGCA 


GGAGTTCCGC 


1500 


GAGCAACGAG 


CATGTCTGGC 

WTVX W X W X W WW 


TGTGCAACGC 


GCACCTCGCC 


GGCAGCAAGG 


ACTTCCCCCG 


1560 


V\J\<UVrf 1 w w 1 w 


GPPAAGAGCG 

w w> vAAvJ/VJ W w 


CGTCCATTCA 
v*v3 x v«v«ai x un 


GACAATCGTC 

Vj^\w^VT% X w\J X w 


TGCATCCCGC 

x wwXkX wwwww 


TCATGGGTGG 

M ^w*J L X Vvwi WW 


1620 


PGTGPTTGAG 

WW X W w 1 1 wAw 


PTTGPT APT A 
w X X ww X Aw x A 


PTGATAAGGT 

w 1 WAV X AAOv 1 


GPPGGAGGAP 

V9 w w- w VJAVJ w 


PPGGAPTTGG 

wwwwV^w X X ww 


TCAGCCGAGC 

X W^kWW-WW^VWW 


1680 


a a p ppt a rr zv 


TTPTPPPAPP 
i 1 LI w^wwiAw" w 


PPP A ATPTPP 
wljwAAl Ij 1 ww 


nAPAT APTPP 


A A AGAGPPGA 

AAAUAU w w> wAV 


GPTPPAAPPP 

O w X w wAVAVw w w 


1740 


w 1 LAvaWilAC 




aavjwHjUaIa 


PATAPTPPTP 


TTPP A PP A PP 
1 1 wwnwonww 


TPGATPAPAA 

X wwAl wAwAA 


X O \J \J 


1 bU^A i V»waL, 


7VTPPAP APPP 
/\x bbnbnLoo 


TPAPTPPPPP 
1 UAL i w'www'w 


PPPPPPfZ AP A 


PAPPPA APPG 
Wvf\ w w w/v\w W w 


GAPAGGAGPT 

wAwAw wAVw w X 


1860 


a^VjawAAVj 1 w 


P AP APPPPPT 


PAA ATGPA AG 


PPTGGAGPAP 
ww 1 «uAbW\t 


ATPAPPAAGG 

/\X wAwwAnwU 


GGATCGACGA 

wwA X w wAw WAV 


1920 


1 v lAtnot 


PTPTGPGAGG 
w l w l o^unoo 


AAATGGAPGT 
Ann x vuAvu x 


GCAGCCGCTA 


GAGGATGCCT 

wnww^^x www x 


GGATAATGGA 


1980 


CGGCTfTAAT 

w WW W 1 X AVA 1 


TTPGAAGTPP 
x x v-»vsaaw x w w 


CGTCGTCAGC 

W X w W X VAOU 


GCTCCCGGTG 

WW X WW WWW X w 


GATGGCTCAA 


GCGCACCCGC 


2040 


T GAT GGTT PT 

X ww 1 1 w X 


PGP GPGAPAA 


GTTTPGTGGT 

W XXX w VJ X \J<J X 


TTGGACGAGG 


TCATCGCACT 


CCTGCTCGGG 


2100 


TGAAGPGGPG 

A VJArVoV* WW w W 


GTGPPGGTCA 

W X x wa. 


T C GAAG AG C C 


GCAGAAATTG 

W ^^jV^J-dkJ U i> X X w 


CTGAAGAAAG 


CGTTGGCCGG 


2160 


CGGCGGTGPT 

w WwTWwW 1 O w 1 


TGGGCGAACA 


CGAACTGCGG 


TGGCGGGGGC 

X ww^wwwwwww 


ACGACGGTAA 


CAGCCCAGGA 


2220 


AAACGGCGCC 


AA GAAP PAP G 


T CAT GT CAGA 


GCGAAAGCGC 


CGGGAGAAGC 


TCAACGAGAT 


2280 


GTTCCTCGTT CTCAAGTCGT TGGTTCCCTC CATTCACAAG GTGGACAAAG CATCCATCCT 2340

CGCCGAAACG ATAGCCTATC TAAAGGAGCT TCAACGAAGG GTACAAGAAC TGGAATCCAG 2400

GAGGCAAGGT GGCAGTGGGT GTGTCAGCAA GAAAGTCTGT GTGGGCTCCA ACTCCAAGAG 2460

GAAGAGCCCA GAGTTCGCCG GTGGCGCGAA GGAGCACCCC TGGGTCCTCC CCATGGACGG 2520

CACCAGCAAC GTCACCGTCA CCGTCTCGGA CACGAACGTG CTCCTGGAGG TGCAATGCCG 2580

GTGGGAGAAG CTCCTGATGA CACGGGTGTT CGACGCCATC AAGAGCCTCC ATTTGGACGC 2640

TCTCTCGGTT CAGGCTTCGG CACCAGATGG CTTCATGAGG CTCAAGATAG GAGCTCAGTT 2700

TGCAGGCTCC GGCGCCGTCG TGCCCGGAAT GATCAGCCAA TCTCTTCGTA AAGCTATAGG 2760

GAAGCGATGA AAGGGCGCTA CATGTGAAGC TTAATTAATG GAAGCAAACT TGTATTTCTT 2820

GTGCAAAAGC TTACTATATA TTTCTGCAAA ACCTGGTGTG CCTTGTTTTG ATTTTCAGTC 2880

GCCAATTGTG CCTTTGTTTT TATCAAGTGA TGATCTACAC ATATATATAG GAATATTTGA 2940

AAAGAGCGAT GTCATAGGGT TTTTTTATTA CAAGGAACAA GTCTTTCACG TGCTGGCCTC 3000

ACAAATCCTA AGAGAAAATC TGCTCATTTT GATTGCGTTC CGCAACAACT CTGTAATCCA 3060

TATCCTATGT ATCCGATCAA CTAGTCGATA GCCTCCGTCC GCCACATCAT CATATATCTA 3120

TCTATGTGTG TCATCTGACA CATACTCCTC GCGTACTGTG CTGACATATG ATACTGACAC 3180

AGCATATATG CATGCACATC GTCACACGAC ATATATCTCG CTACTACACA GATATTGGAT 3240

ACGATACTAT ATAGCATCAT GCGTGCTGCG ATNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3300

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3360

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3420

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3480

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3540

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3600

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3660

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3720

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3780

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 3840

NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NGTCGACCTG 3900

CAGGCATGCA AGCTT 3915



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4137 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: EcoRI-HindIII region of plasmid pCOL13

(ix) FEATURE:

(A) NAME/KEY: prim_transcript

(B) LOCATION: 188

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 188..212

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 213..556

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 557..718

(ix) FEATURE:

(A) NAME/KEY: intron

(B) LOCATION: 719..1224

(ix) FEATURE:

(A) NAME/KEY: exon

(B) LOCATION: 1226..2771

(D) OTHER INFORMATION: /codon_start= 2

/note= "exon containing 3' end coding region of B-peru gene,
this exon continues up to the polyadenylation site."

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 576..718

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1226..2771

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1269..2771

(D) OTHER INFORMATION: /note= "fragment of B-peru coding
region which is derived from cDNA"

(ix) FEATURE:

(A) NAME/KEY: 3'UTR

(B) LOCATION: 2772..4137

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= EcoRI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 11..16

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 45..50

(D) OTHER INFORMATION: /label= KpnI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 265..270

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 329..334

(D) OTHER INFORMATION: /label= XbaI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 835..840

(D) OTHER INFORMATION: /label= BamHI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1269..1274

(D) OTHER INFORMATION: /label= MluI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2788..2793

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2884..2889

(D) OTHER INFORMATION: /label= MunI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 2828..2833

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4114..4119

(D) OTHER INFORMATION: /label= SalI

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4132..4137

(D) OTHER INFORMATION: /label= HindIII

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 4114..4137

(D) OTHER INFORMATION: /label= polylinker
/note= "part of polylinker of pUC19"
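
Each restriction-site label in the feature table of SEQ ID NO: 7 points at a six-base location, so a label can be cross-checked against the sequence by comparing the bases at that location with the enzyme's recognition sequence. The Python sketch below does this for the three labels that fall within the first 60 bases of SEQ ID NO: 7. It is an illustration only: the names `RECOGNITION`, `LABELS` and `check_labels` are assumptions for this example, and the recognition sequences (EcoRI GAATTC, XbaI TCTAGA, KpnI GGTACC) are the generally known ones rather than values stated in the listing.

```python
# Generally known recognition sequences (not stated in the listing itself).
RECOGNITION = {"EcoRI": "GAATTC", "XbaI": "TCTAGA", "KpnI": "GGTACC"}

# (label, start, end), 1-based inclusive, from the feature table of SEQ ID NO: 7.
LABELS = [("EcoRI", 1, 6), ("XbaI", 11, 16), ("KpnI", 45, 50)]

# First 60 bases of SEQ ID NO: 7 as given in its sequence description.
SEQ7_HEAD = (
    "GAATTCAGGT" "TCTAGACTAT" "TCTTGTGGCC"
    "TCGGGCGGAT" "GGCGGGTACC" "CATGTCTTCG"
)


def check_labels(seq, labels, recognition):
    """Map each (label, start) to True if the bases at start..end equal the
    recognition sequence of the named enzyme, otherwise False."""
    return {
        (name, start): seq[start - 1:end] == recognition[name]
        for name, start, end in labels
    }


result = check_labels(SEQ7_HEAD, LABELS, RECOGNITION)
print(result)
```

A label that evaluates to False would indicate either a transcription error in the sequence rows or a misnumbered location, which is useful when a listing has been recovered from a scanned document.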

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



GAATTCAGGT TCTAGACTAT TCTTGTGGCC TCGGGCGGAT GGCGGGTACC CATGTCTTCG 60

TTAGGCTTAT CTGACCGTGG AGATGAAATC TAACGGCTCA TAGAAATTAA ACTAACGTGG 120

ACACTCTGTC CTTGCTGTTT TGCTCCCTGC TCTTTATATA TAGAATGCCT GCTTGCATTG 180

CACCCGTACG TACAGCGTAG CGCGGAGTGG AGGTGAGCTC CTCCTCCGAT TCTTGCCTAA 240

TCTTTGGTCT TTGCACACGT ACGAAAGCTT TTTGCATTGT TTCGTTGCTT CTGGATGATC 300

AGTACTCTTA GATATTAAGC GATACCGATC TAGAATCGAG TTGTTGTACT CTCTCTGTCC 360

CTTTTGTGCA GCTATAACTA GCTAGGTTCC TTCGCATAGA GCCTCTCTAC AGAGTACAGA 420

CTAGCTAGGA GTGTCAGACA CGAAATGGAA ATGGTCACTT CCAAATTGCA CGAGCTGGAA 480

TTATATACTC TTCTGATCTT CTTCACCGTC TCTTTATAGC GTGATATGCG TTTCTGGCTT 540

CTTGCTTACG TGAAGGATTA TTAGTAAGGC GCGTGATGGC GCTCTCAGCT TCCCCGGCTC 600
AGGAAGAACT GCTG[illegible] [illegible] [illegible] GCAGCTTGCT GCAGCCGCGA 660
GGAGCATCAA CTGGAGC[illegible] GCCCTCTTCT [illegible] AAGCACTCAA CGACCTCGGT 720
AAATGGAAGT CCTGATAATC TATAATTTGT CTGGCAGTTT TCTACAACTC TGGTGAATGA 780
TCGTCACTTC GTTTGCCTGA TACATACATA CATACATATG AAATAAAGAA AGTCGGATCC 840
CGTGATGCGA TTGTAGTTAT CGCTTTTCCG CAAAATGGTT GCTTTTTGAA TCTGCATTCG 900
TTTTTTTCCC ACATCTTCTT CCTTCTCGCG AGTAACGACA ACGCCACCGC GCGCCGCCTG 960
CCGCCCATCG CCCCGCCTTG GCCGGCGAGA GCCTCAGCCT ATTACACCAG CGGCGACCTC 1020
TTTTCCCCTT CCTCTCACCG CCCTCGTGGC CGTGCTCACC CCCGCTCTAA CCTGGTCTGG 1080
CCGCCTCCGC TGCCACCTGC TCCGGCGGCC TCACCCGCGT CTTTCTCGTC CCTACCCTCT 1140
CTGCCTCTGG GCGCATCATC ATCTGATATT CTGATGCAAA GAAAAAAGGT ATACCATATA 1200
AGGACAACAG AAAATATGGT TGCAGGGTGC TGACGTGGAC GGACGGGTTC TACAATGGCG 1260
AGGTGAAGAC GCGTAAGATC TCCCACTCCG TGGAGCTGAC AGCCGACCAG CTGCTCATGC 1320
AGAGGAGCGA GCAGCTCCGG GAGCTCTACG AGGCCCTCCG GTCCGGCGAG TGCGACCGCC 1380
GCGGCGCGCG GCCGGTGGGC TCGCTGTCGC CGGAGGACCT CGGGGACACC GAGTGGTACT 1440
ACGTGATCTG CATGACCTAC GCCTTCCTGC CGGGCCAAGG CTTGCCCGGC AGGAGTTCCG 1500
CGAGCAACGA GCATGTCTGG CTGTGCAACG CGCACCTCGC CGGCAGCAAG GACTTCCCCC 1560
GGGCGCTCCT GGCCAAGAGC GCGTCCATTC AGACAATCGT CTGCATCCCG CTCATGGGTG 1620
GCGTGCTTGA GCTTGGTACT ACTGATAAGG TGCCGGAGGA CCCGGACTTG GTCAGCCGAG 1680
CAACCGTAGC ATTCTGGGAG CCGCAATGTC CGACATACTC GAAAGAGCCG AGCTCCAACC 1740
CGTCAGCATA CGAAACCGGG GAAGCCGCAT ACATAGTCGT GTTGGAGGAC CTCGATCACA 1800
ATGCCATGGA CATGGAGACG GTGACTGCCG CCGCCG[illegible] ACACGGAACC GGACAGGAGC 1860


TAGGAGAAGT 


CGAGAGCCCG 


1 bLAA 




CAT CAC CAAG 


GGGATCGACG 




AGTTCTACAG 


CCTCTGCGAG 






AGAGGATGCC 


TGGATAAT GG 




ACGGGTCTAA TTTCGAAGTC 


1 1 i*/\Vs 




GGATGGCTCA AGCGCACCCG 


2040 


CTGATGGTTC TCGCGCGACA [illegible] [illegible] GTCATCGCAC TCCTGCTCGG 2100
GTGAAGCGGC GGTGCCGGTC ATCGAAGAGC CGCAGAAATT GCTGAAGAAA GCGTTGGCCG 2160
GCGGCGGTGC TTGGGCGAAC ACGAACTGCG GTGGCGGGGG CACGACGGTA ACAGCCCAGG 2220
AAAACGGCGC CAAGAACCAC GTCATGTCAG AGCGAAAGCG CCGGGAGAAG CTCAACGAGA 2280
TGTTCCTCGT TCTCAAGTCG TTGGTTCCCT CCATTCACAA GGTGGACAAA GCATCCATCC 2340
TCGCCGAAAC GATAGCCTAT CTAAAGGAGC TTCAACGAAG GGTACAAGAA CTGGAATCCA 2400
GGAGGCAAGG TGGCAGTGGG TGTGTCAGCA AGAAAGTCTG TGTGGGCTCC AACTCCAAGA 2460
GGAAGAGCCC AGAGTTCGCC GGTGGCGCGA AGGAGCACCC CTGGGTCCTC CCCATGGACG 2520
GCACCAGCAA CGTCACCGTC ACCGTCTCGG ACACGAACGT GCTCCTGGAG GTGCAATGCC 2580
GGTGGGAGAA GCTCCTGATG ACACGGGTGT TCGACGCCAT CAAGAGCCTC CATTTGGACG 2640
CTCTCTCGGT TCAGGCTTCG GCACCAGATG GCTTCATGAG GCTCAAGATA GGAGCTCAGT 2700
TTGCAGGCTC CGGCGCCGTC GTGCCCGGAA TGATCAGCCA ATCTCTTCGT AAAGCTATAG 2760
GGAAGCGATG AAAGGGCGCT ACATGTGAAG CTTAATTAAT GGAAGCAAAC TTGTATTTCT 2820
TGTGCAAAAG CTTACTATAT ATTTCTGCAA AACCTGGTGT GCCTTGTTTT GATTTTCAGT 2880
CGCCAATTGT GCCTTTGTTT TTATCAAGTG ATGATCTACA CTATATATAT GGAATATTTG 2940
AAAAGAGCGA TGTCATAGGG TTTTTTTATT ACAAGGAACA AGTCTTTCAC GTGCTGGCCT 3000
CACAAATCCA AGAGAAAATC TGCTCATTTT GATTGGCTTC CGCAACAACT CTGTAATCCA 3060
TATCCTTTGT ATCCGATCAA CTATGATACC TCCTCCCCCA TCTCTTTTTT TTTTATCTGC 3120
ACAATCTTCT ATTCTACTAT AATGAAACAA TAGAGCCACT ACCGAATATT TCCTCAAAAA 3180
TGTACAACAA ACTAGGGTGG TCCAAACAAA TGCCTAGAGG AGCTAGATTC TCTTAAATTA 3240
GACATCGGTT TCTTTTATCT CTTCCAGAAG GGATAAAAGT ATGTGTTTAT GGTCTTCAGT 3300
AATACATTGT TCGTTTCTCA TAGTCAATTT AGAGGTGTTT AAATGTACTT GAACTAATAG 3360
TTAGTTGGTT TAAAAATTAC TATTAAAATT AGTTAGTTAA TAAATAGCTA GCTAAATATT 3420
AGCTAATTTG TCAAAAGTAG CTAATAGCTG AATTATTAGC TATATTGTTT TGATGTCTTC 3480
AGCTAATTTT AGCAGATCAT TATTAGTTCT AGTGTATCTA AACACACCCT TAGTCAAACA 3540
TGGTAAAAAA AAAGTTGATT CACTCATTGC TCATCGAAGA CGCAGATCAT GGCATCCCTC 3600
ACACGTTCTT CAGCCTACAC GGCACTTGCA TTGTAATTGC ATCTCATCTC ATCAACCCTT 3660
GTTGTGCATT ACTTGCCACA TGCGCCATCA ATTAACATTT TTTTGTCTCG TTCCTGAATT 3720
TCCTAACAAA TTTCATCAAA TGTACGCAGA GCTAAAGCTA GCTGTCGATG TCAGTTGACA 3780
GTTGACACCG ATGAATTTTA GAAAATTTAG TGTAAAGTAC TATTTATAAT GTTCATGACA 3840
CCCATATAAA ATATGTTGAC ACCGGCAAAC CTCAAGGCTA GCTTCGCCCC TGCCATCAAC 3900
CTTACATCTA CATTCACCAC GAGGTGTGCA CGGCCTAGGT TCGACTCCTA TGTCATGCCT 3960
TGCTATCTAC AGATTCAGCA AGTGTTGTGT TCCTTGTTGT CACAATCTAC CTTTATTATA 4020
AAATTGATGT CATATCATGC CAAACAACAA ATAATTAATA TCGTGTGAAA TTTGAATTTC 4080
TCTAACATGC TCAACCAACC TTACCCCTTC ACGGTCGACC TGCAGGCATG CAAGCTT 4137






WO 95/34634 



PCT/EP95/02157 






SEQ. ID No. 7 



actagtacctgtcgcgcgcccatgcgcgcgtggcgtgcttctcgccctggtaactgttctcggcaaatga 

ctatttccaagtaaacatattcaatgattttgctattcttagcaaagtaatttcacttggacttttgtgc 

caaaaacgcattggaaaaaatctecttggactccagcctaaggttgaaagtgfcaaaaactgggaaaaatt 

attgatgtttcgggcagttacttggctatgtaaattccatacctttt:cGaaatatcct.aaacat:tcfcttt 

ctgtttctgcaacatacatgtttatcagttctggacctttgacgctacgaaagttcagtgagtattcagg 

ctttcgcaagtaaaacctagaagtccaacggacattcattttagcgatcccatguctttaggatgcactt 

gttatcggatgtctcctatgagacagaatgcacttgttatggtaactaaacaaaaaaatataatttaatt 

cgtgtgaaactttttcaaacctaccttccctgttcccggaggtccatatacccagacacctaatcgcttg 

cgcaatttagaagaaatcatgcgattatacgtcaaagggagctgaaat-atceiagcaaaagaaaaggtcat 

cccacaaaagcccaaaactattgtagggaaaacacttgttttacctataattgagcgtcgtattggtgtt 

gctgatatttactgctaaaccaagtccaatttaccagaatagtatctagaagaatccttttcacatcctc 

tagcccgccaacatcctaccatttgacattgagaactaaaaaacaaattgttcccagacgaaagctaaag 

tcgctttatacgattagctgcagtaggtgagcacgatctccgaacgctgggcatgacacgaccatgatag 

acgacatggacattttgtcaaacacctgcatggcgtcaccagggaaaacaatccagcaggagagttggga 

gagagatggaaacaattaattatgcaaacacggaggagacacaatttgaagagtgttcgtacacctacgg 

caatcagcgaaacgatgagagagcataccaagctcgggtcgtcagacacgcgrgaggacggacggtggcac 

cgatggagatggagacagttgcgtgccgtttftttgtggagggcttcgttggtgtcgggcgtcggcggagc 

ctgaacgcggfcgggaagaagagcggcgtggtgggaagaagagcgacgtcaggttctagactattcttgtg 

gcctcgggcggatggcgggtacccatgtcttcgttaggcttatctgaccgtggagatgaaatctaacggc 

tcatagaaattaaactaacgtggactcccagacgaaagctaaagtcgctttatacgatcagctgcagtag 

gtgagcacgatctccgaacgctgggcatgacacgaccatgatagacgacatggacattttgticaaacacc 

tgcatggcgtcaccagggaaaacaatccagcaggagagttgggagagagatggaaacaattaattatgca 

aacacggaggagacacaatttgaagagtgttcgtacacctacggcaafccagcgaaacgafcgagagagcat 

accaagctcgggtcgtcagcaacgcggaggacggacggtggcacegatggagatggagacagttgcgtgc 

cgttttttgtggagggcttcgttggtgtcgggcgtcggcggagcctgaacgcggtgggaagaagagcttc 

gtggtgggaagaagagcgacgacaggttctagactattcttgtggcctcgggcggatggcgggtacccat 

gtcttcgttaggcttatctgaccgtggagatgaaatctaacggctcatagaaattaaactaacgtggaca 

ctctgtccttgctgttttgctccctgctctttatatatagaatgcctgcttgcattgcacccgtacgtac 

agcgtagcgcggagtggaggtgagctcctcctccgattcttgcctaatctttggtctttgcacacgtacg 

aaagctttttgcattgtttcgttgcttctggatgatcagtactcttagatattaagcgataccgatctag 

aatcgagttgttgtactctctctgtcccttttgtgcagctataactagctaggttccttcgcatagagcc 

tctctacagagtacagactagctagcagtgtcagacacgaaatggaaatggtcacttccaaattgcacga 

gctggaattatatactcttctgatcttcttcaccgtctctttatagcgtgatatgcgtttctggcttctt 

gcttacgtgaaggattattagtaaggcgcgtgatggcgctctcagcttccccggcccaggaagaactgct 

gcagcctgctgggaggccgttgaggaagcagcttgctgcagccgcgaggagcatcaactggagctatgcc 

ctcttctggtccatttcaagcactcaacgacctcggtaaatggaagtcctgataatctataatttgtctg 

gcagttttctacaactctggtgaatgatcgtcacttcgtttgcctgatacatacatacatacatatgaaa 

taaagaaagtcggatcccgtgatgcgattgcagttatcgcttttccgcaaaatggttgctttttgaatct 

gel 
CLAIMS

1. A plant consisting essentially of cells which comprise 
in their genome: 

- a homozygous male-sterility genotype at a first genetic 
locus; and 

- a color-linked restorer genotype at a second genetic 
locus, which is heterozygous (Rf/-) for a foreign DNA Rf
comprising: 

a) a fertility-restorer gene capable of preventing the 
phenotypic expression of said male-sterility 
genotype, and 

b) at least one anthocyanin regulatory gene involved in 
the regulation of anthocyanin biosynthesis in cells 
of seeds of said plant which is capable of producing 
anthocyanin at least in the seeds of said plant, so 
that anthocyanin production in the seeds is visible 
externally. 

2. The plant of claim 1 in which said color gene is 
capable of producing anthocyanin at least in the aleurone 
of the seeds of said plant. 

3. The plant of claim 1, in which said first genetic
locus is homozygous for a foreign DNA S (S/S) which
comprises a male-sterility gene which when expressed in
cells of the plant renders the plant male-sterile without
otherwise substantially affecting the growth and
development of the plant.

4. The plant of claim 1, in which said first genetic 
locus is homozygous for a foreign DNA S (S/S) which 
comprises a male-sterility gene which comprises: 

s1) a male-sterility DNA encoding a RNA, protein or
polypeptide which, when produced or overproduced in a 
stamen cell of the plant, significantly disturbs the 
metabolism, functioning and/or development of said 
cell, and, 

s2) a sterility promoter capable of directing expression 
of the male-sterility DNA selectively in the stamen 
cells, preferably the tapetum cells, of the plant; 
the male-sterility DNA being in the same 
transcriptional unit as, and under the control of, 
the sterility promoter, 
and in which said fertility restorer gene in said second
genetic locus comprises at least:

a1) a fertility-restorer DNA encoding a restorer RNA,
protein or polypeptide which, when produced or
overproduced in the same cell as said male-sterility
gene S, prevents the phenotypic expression of S,
and,

a2) a restorer promoter capable of directing expression 
of the fertility-restorer DNA at least in the same 
cells in which said male-sterility gene is 
expressed, so that the phenotypic expression of said 
male-sterility gene is prevented; the fertility- 
restorer DNA being in the same transcriptional unit 
as, and under the control of, the restorer promoter. 

5. The plant of claim 1 in which said male-sterility DNA 
encodes barnase and in which said fertility restorer DNA 
encodes barstar.



6. The plant of claim 1 in which the sterility promoter
and/or the restorer promoter is selected from the group
consisting of PTA29, PCA55, PT72, PT42, and PE1.

7. The plant of claim 1 in which the homozygous
male-sterility genotype is endogenous and is homozygous for a
recessive allele m (m/m) and in which the fertility
restorer gene is the dominant allele M of said endogenous
male-sterility genotype.

8. The plant of claim 1, which is a cereal plant which is
selected from the group consisting of corn, wheat, and
rice.

9. The plant according to claim 1 wherein said
anthocyanin regulatory gene is a shortened R, B or C1
gene or a combination of shortened R, B or C1 genes which
is functional for conditioning and regulating anthocyanin
production in the aleurone.

10. The plant according to claim 9 wherein said
anthocyanin regulatory gene is selected from the group
consisting of a shortened C1 or C1-S gene having a
nucleotide sequence corresponding to the sequence between
positions 447 and 2418 of SEQ ID No. 1, a shortened
B-peru gene having a nucleotide sequence corresponding to
the sequence between positions 1 and 3272 of SEQ ID
No. 6; and the EcoRI-SalI fragment having a length of about
4000 bp of pCOL13.

11. The plant according to claim 10 wherein said
anthocyanin regulatory gene does not contain any
introns.

12. The plant according to claim 9 wherein said
anthocyanin regulatory gene comprises a shortened C1 or
C1-S gene and a shortened B-peru gene.

13. The plant according to claim 9 wherein said
anthocyanin regulatory gene is a chimeric DNA comprising
a coding region of an R or B gene and/or C1 gene operably
linked to an aleurone-specific promoter.

14. The plant according to claim 13 wherein said
aleurone-specific promoter is selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1, and the sequence between
positions 1-575 of SEQ ID No. 6.

15. The plant according to claim 14, wherein said
aleurone-specific promoter is selected from the group
consisting of: the sequence between positions 1 to 1061
or 447 to 1061 of SEQ ID No. 1, and the sequence between
positions 1 to 188 of SEQ ID No. 6.
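
The promoter fragments in claims 14 and 15 are defined by position ranges. As a small illustration only, the Python sketch below shows how such a range maps onto a sequence string, assuming the positions are 1-based and inclusive (the claims do not state this explicitly). The placeholder string stands in for SEQ ID No. 1, which is not reproduced here, and the helper name `fragment` is hypothetical.

```python
# Illustrative only: extracting a fragment given claim-style positions.
# Assumption: positions are 1-based and inclusive.

# Placeholder of 1200 nucleotides; this is NOT the real SEQ ID No. 1.
PLACEHOLDER_SEQ = "ACGT" * 300


def fragment(seq, start, end):
    """Return the stretch of seq from position start to position end."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions out of range for this sequence")
    return seq[start - 1:end]


# Position ranges named for SEQ ID No. 1 in claims 14 and 15.
RANGES = [(1, 1077), (447, 1077), (1, 1061), (447, 1061)]

for start, end in RANGES:
    print(start, end, len(fragment(PLACEHOLDER_SEQ, start, end)))
```

Under the inclusive reading, the 447 to 1077 fragment is 631 nucleotides long and the 447 to 1061 fragment is 615 nucleotides long.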

16. A DNA comprising an anthocyanin regulatory gene which
is a shortened R, B or C1 gene or a combination of
shortened R, B or C1 genes which is functional for
conditioning and regulating anthocyanin production in the
aleurone.

17. A DNA according to claim 16, which comprises a
shortened C1 or C1-S gene and a shortened B-peru gene.

18. A DNA according to claim 16, which comprises at least
one gene selected from the group consisting of a
shortened B-peru gene having a nucleotide sequence
corresponding to the sequence between positions 1 and
3272 of SEQ ID No. 6, a shortened B-peru gene which is
the EcoRI-SalI fragment with a length of about 4000 bp
of pCOL13 and the shortened C1 or C1-S gene having a
nucleotide sequence corresponding to the sequence between
positions 447 and 2418 of SEQ ID No. 1.

19. The DNA of claim 18 in which said shortened B-peru,
C1 or C1-S gene is further characterized by not
containing any intron.

20. A DNA according to claim 16, wherein said shortened
C1, C1-S or B-peru genes are operably linked to an
aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1 and the sequence between
positions 1-575 of SEQ ID No. 6.

21. A DNA according to claim 19, wherein said
aleurone-specific promoter is selected from the group consisting
of: the sequence between positions 1 to 1061 or 447 and
1061 of SEQ ID No. 1 and the sequence between positions 1
to 188 of SEQ ID No. 6.

22. A DNA according to claim 16 which further comprises a
fertility-restorer gene capable of preventing the
phenotypic expression of a male-sterility genotype in a
plant.

23. A DNA according to claim 22 wherein said
fertility-restorer gene encodes barstar.

24. A DNA according to claim 23, wherein barstar is under
the control of a promoter selected from the group
consisting of PTA29, PCA55, PT72, PT42 and PE1.

25. An aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1077
or 447 to 1077 of SEQ ID No. 1 and the sequence between
positions 1-575 of SEQ ID No. 6.

26. An aleurone-specific promoter selected from the group
consisting of: the sequence between positions 1 to 1061
or 447 and 1061 of SEQ ID No. 1 and the sequence between
positions 1 to 188 of SEQ ID No. 6.

27. A process to maintain a line of male-sterile plants, 
which comprises the following steps: 

i) crossing

a) a male-sterile parent plant of said line having in
a first genetic locus, a homozygous male-sterility
genotype and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, stably
integrated in their nuclear genome:

a homozygous male-sterility genotype at a first 
genetic locus; and 

a colored-linked restorer genotype at a second 
genetic locus, which is heterozygous for a 
foreign DNA comprising: 

i) a fertility-restorer gene capable of 
preventing the phenotypic expression of 
said male-sterility genotype, and 

ii) at least one anthocyanin regulatory gene 
involved in the regulation of anthocyanin 
biosynthesis in cells of seeds of said 
plant which is capable of producing 
anthocyanin at least in the seeds of said 
plant, so that anthocyanin production in 
the seeds is visible externally, 

ii) obtaining the seeds from said parent plants, and

iii) separating on the basis of color, the seeds in which
no anthocyanin is produced and which grow into
male-sterile parent plants.
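
The segregation behind the process of claim 27 can be sketched in a few lines of Python. This is an illustration rather than part of the claims, and it rests on stated assumptions: the fertility-restorer gene and the anthocyanin regulatory gene are inherited together because they sit on the same foreign DNA, a single copy of that DNA is enough for visible seed color, and the male-sterile parent contributes no copy. The function names are hypothetical.

```python
from itertools import product

# Alleles at the second (restorer) locus. "Rf" stands for the foreign DNA that
# carries both the fertility-restorer gene and the anthocyanin regulatory
# gene; "-" stands for its absence. Both parents are homozygous for male
# sterility at the first locus, so that locus does not segregate and is
# left out of this sketch.
MALE_STERILE_PARENT = ("-", "-")
MAINTAINER_PARENT = ("Rf", "-")


def cross(seed_parent, pollen_parent):
    """Return the equally likely seed genotypes of seed_parent x pollen_parent."""
    return list(product(seed_parent, pollen_parent))


def seed_is_colored(genotype):
    # Assumption: one copy of Rf gives externally visible anthocyanin.
    return "Rf" in genotype


def plant_is_male_sterile(genotype):
    # Without the restorer gene, the male-sterility genotype is expressed.
    return "Rf" not in genotype


seeds = cross(MALE_STERILE_PARENT, MAINTAINER_PARENT)
colorless = [g for g in seeds if not seed_is_colored(g)]
colored = [g for g in seeds if seed_is_colored(g)]

print("colorless fraction:", len(colorless) / len(seeds))
print("all colorless seeds male-sterile:",
      all(plant_is_male_sterile(g) for g in colorless))
```

Under these assumptions half of the seeds are colorless, and every colorless seed lacks the restorer gene, which is why sorting by color separates the male-sterile progeny from the maintainer progeny.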

28. A process according to claim 27, wherein the genome 
of said male-sterile parent plant does not contain at 
least one anthocyanin regulatory gene necessary for the 
regulation of anthocyanin biosynthesis in the seeds of 
said plant to produce externally visible anthocyanin in 
said seeds. 

29. The process of claim 28, wherein the genome of said
male-sterile parent plant contains a first anthocyanin
regulatory gene and the genome of said maintainer parent
plant contains a second anthocyanin regulatory gene
which, when present with said first anthocyanin
regulatory gene in the genome of a plant is capable of
conditioning the production of externally visible
anthocyanin in seeds.
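
The complementation described in claim 29 can be illustrated with a minimal Python sketch. It is not taken from the specification. It assumes that the first regulatory gene is homozygous in the male-sterile seed parent, so that every seed inherits it, and that the second regulatory gene is inherited only together with the foreign restorer DNA. The gene labels and function names are hypothetical.

```python
# Hypothetical labels for the two complementing regulatory genes of claim 29.
FIRST = "first_regulatory_gene"    # assumed homozygous in the male-sterile parent
SECOND = "second_regulatory_gene"  # assumed to be carried on the foreign DNA Rf


def seed_regulatory_genes(inherits_rf):
    """Regulatory genes present in a seed of male-sterile x maintainer."""
    genes = {FIRST}          # every seed receives the first gene
    if inherits_rf:
        genes.add(SECOND)    # only seeds that inherit Rf receive the second
    return genes


def anthocyanin_visible(genes):
    # Color is conditioned only when both regulatory genes are present.
    return {FIRST, SECOND} <= genes


for inherits_rf in (True, False):
    genes = seed_regulatory_genes(inherits_rf)
    print("inherits Rf:", inherits_rf, "colored seed:", anthocyanin_visible(genes))
```

In this arrangement neither gene alone produces color, so the male-sterile line stays colorless while seeds carrying the restorer DNA are marked.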

30. A process to maintain a line of maintainer plants, 
which comprises the following steps: 

i) crossing:

a) a male-sterile parent plant of said line having, in
a first genetic locus, a homozygous male-sterility
genotype, and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, stably
integrated in their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a colored-linked restorer genotype at a second
genetic locus, which is heterozygous for a
foreign DNA comprising:

i) a fertility-restorer gene capable of preventing
the phenotypic expression of said
male-sterility genotype, and

ii) at least one anthocyanin regulatory gene
involved in the regulation of anthocyanin
biosynthesis in cells of seeds of said plant
which is capable of producing anthocyanin at
least in the seeds of said plant, so that
anthocyanin production in the seeds is visible
externally,

ii) obtaining the seeds from said male-sterile parent
plant, and

iii) separating on the basis of color, the seeds in which
anthocyanin is produced and which grow into
maintainer parent plants.

31. A process according to claim 30, wherein the genome
of said male-sterile parent plant does not contain at
least one anthocyanin regulatory gene necessary for the
regulation of anthocyanin biosynthesis in the seeds of
said plant to produce externally visible anthocyanin in
said seeds.

32. The process of claim 31, wherein the genome of said
male-sterile parent plant contains a first anthocyanin
regulatory gene and the genome of said maintainer parent
plant contains a second anthocyanin regulatory gene
which, when present with said first anthocyanin
regulatory gene in the genome of a plant is capable of
conditioning the production of externally visible
anthocyanin in seeds.

33. A kit for maintaining a line of male-sterile or
maintainer plants, said kit comprising:

a) a male-sterile parent plant of said line having, in a
first genetic locus, a homozygous male-sterility
genotype, and

b) a maintainer parent plant of said line consisting
essentially of cells which comprise, integrated in
their nuclear genome:

- a homozygous male-sterility genotype at a first
genetic locus; and

- a colored-linked restorer genotype at a second
genetic locus, which is heterozygous for a foreign
DNA comprising:

i) a fertility-restorer gene capable of
preventing the phenotypic expression of said
male-sterility genotype, and

ii) at least one anthocyanin regulatory gene
involved in the regulation of anthocyanin
biosynthesis in cells of seeds of said plant
which is capable of producing anthocyanin at
least in the seeds of said plant, so that
anthocyanin production in the seeds is
visible externally.

34. A process according to claim 33, wherein the genome
of said male-sterile parent plant does not contain at 
least one anthocyanin regulatory gene necessary for the 
regulation of anthocyanin biosynthesis in the seeds of 
said plant to produce externally visible anthocyanin in 
said seeds. 

35. The process of claim 34, wherein the genome of said
male-sterile parent plant contains a first anthocyanin
regulatory gene and the genome of said maintainer parent
plant contains a second anthocyanin regulatory gene
which, when present with said first anthocyanin
regulatory gene in the genome of a plant is capable of
conditioning the production of externally visible
anthocyanin in seeds.

36. Process to maintain a kit according to claim 33 which
comprises:

crossing said male-sterile parent plant with said 
maintainer parent plant; 

obtaining the seeds from said male-sterile parent 
plants and optionally the seeds from said maintainer 
parent plant in which no anthocyanin is produced; and 
optionally growing said seeds into male-sterile parent 
plants and maintainer parent plants. 